September 1959 


Vol. 15 No. 3 


JOURNAL OF THE BIOMETRIC SOCIETY 


Biometric Method, Past, Present, and Future    J. O. Irwin
Optimum Group Size in Half-Sib Family Selection    J. M. Rendel
Pair Comparison, With and Without Ties    N. T. Gridgeman
A Method for Testing Treatment Effects in the Presence of Learning    Seymour Geisser
On the Development of Clinical Statistical Systems for Psychiatry    J. B. Chassan
The Comparison of the Sensitivities of Similar Experiments: Model II of the Analysis of Variance    D. E. W. Schumann and R. A. Bradley
The Use of the Power Function to Determine an Adequate Number of Progeny per Sire in a Genetic Experiment Involving Half-Sibs    Stanley Wearden
Confidence Limits for the LD50 Using the Moving Average-Angle Method    Eugene K. Harris
Contribution to the Study of Grouped Observations. IV. Some Comments on Simple Estimates    N. F. Gjeddebæk
Some Recent Results in Chi-Square Goodness-of-Fit Tests    G. S. Watson
The Sampling Variance of the Genetic Correlation Coefficient    Alan Robertson

Queries and Notes

Test of Difference Between Treatment and Control with Multiple Replications of Control and a Missing Plot    David Hogben




THE BIOMETRIC SOCIETY 


The Biometric Society is an international society devoted to the mathematical 
and statistical aspects of biology. Biologists, mathematicians, statisticians, and 
others interested in its objectives are invited to become members. Through its 
regional organizations the Society sponsors regional and local meetings. National 
secretaries serve the interests of members in Denmark, India, Japan, the Netherlands, 
Sweden, and Switzerland, and there are many members at large. 

Biometrics, the journal of the Biometric Society, is published quarterly. Its 
general objects are to promote and to extend the use of mathematical and statistical 
methods in pure and applied biological sciences, by describing and exemplifying 
developments in these methods and their applications in a form readily assimilable 
by experimental scientists. It is also intended to provide a medium for exchange of 
ideas by experimenters and those concerned primarily with analysis and the develop- 
ment of statistical methodology. 

Original papers, and authoritative expository or review articles or critiques, 
will be accepted for publication in Biometrics if judged consistent with these general 
aims. Predominantly analytical or methodological papers should contribute specif- 
ically to the formulation of quantitative hypotheses, to the interpretation of data, 
or to the planning or analysis of experiments or surveys. Papers dealing with bio- 
logical subjects should report conclusions of definite applicability reached by mathe- 
matical or statistical analysis, so described as to facilitate possible use of the procedure 
in other fields of biology or related sciences. 

Technical notes or problems for consideration under the heading of Queries 
and Notes are invited. 

Information for contributors is given on the inside back cover. 

ANNUAL DUES AND MEMBERSHIPS FOR 1959

Memberships (including dues and subscriptions to this journal)
    Membership (U.S.A. and Canada)
    Student membership (U.S.A. and Canada)
    Membership for others
    Student membership for others
    Sustaining membership (including two subscriptions to this journal)    $100.00

Subscriptions
    Non-members of the Biometric Society    $7.00

All dues, memberships, and subscriptions are payable in U.S.A. currency. In- 
formation concerning the Society and memberships may be obtained from its Secre- 
tary, M. J. R. Healy, Statistics Department, Rothamsted Experimental Station, 
Harpenden, Herts., England. Non-member subscriptions are payable to the Manag- 
ing Editor, Biometrics, Department of Statistics, The Florida State University, 
Tallahassee, Florida, U.S.A. 




BUSINESS OFFICE OF BIOMETRICS: Department of Statistics, Florida 
State University, Tallahassee, Florida, U.S.A. Non-member subscriptions, changes 
of address for non-member subscriptions, and undeliverable copies should be sent 
to this office. 

BUSINESS OFFICE OF THE SOCIETY: 509 West Hill Road, Knoxville 19, 
Tennessee, U.S.A. Members of the Biometric Society are advised to send changes 
of address and all correspondence regarding members’ subscriptions to the business 
office of the Society. 

Second-class mailing privileges authorized at Blacksburg, Virginia, with ad- 
ditional entry at Richmond, Virginia. Biometrics is published quarterly—in March, 
June, September, and December. 




The Biometric Society

FOUNDED BY THE BIOMETRICS SECTION OF THE AMERICAN STATISTICAL ASSOCIATION


TABLE OF CONTENTS

Biometric Method, Past, Present, and Future    J. O. Irwin
Optimum Group Size in Half-Sib Family Selection    J. M. Rendel    376
Pair Comparison, With and Without Ties    N. T. Gridgeman    382
A Method for Testing Treatment Effects in the Presence of Learning    Seymour Geisser    389
On the Development of Clinical Statistical Systems for Psychiatry    J. B. Chassan    396
The Comparison of the Sensitivities of Similar Experiments: Model II of the Analysis of Variance    D. E. W. Schumann and R. A. Bradley    405
The Use of the Power Function to Determine an Adequate Number of Progeny per Sire in a Genetic Experiment Involving Half-Sibs    Stanley Wearden    417
Confidence Limits for the LD50 Using the Moving Average-Angle Method    Eugene K. Harris    424
Contribution to the Study of Grouped Observations. IV. Some Comments on Simple Estimates    N. F. Gjeddebæk    433
Some Recent Results in Chi-Square Goodness-of-Fit Tests    G. S. Watson    440
The Sampling Variance of the Genetic Correlation Coefficient    Alan Robertson

Queries and Notes

Test of Difference Between Treatment and Control with Multiple Replications of Control and a Missing Plot    David Hogben

Abstracts
The Biometric Society
News and Announcements

Number 3    September 1959    Volume 15




BIOMETRICS 


Editor 
Ralph A. Bradley 


Assistant to the Editor: Jane L. Worley 


Editorial Board 


Editorial Associates and Committee Members: C. I. Bliss, Irwin Bross, E. A. 
Cornish, S. Lee Crump, H. A. David, W. J. Dixon, Mary Elveback, D. J. Finney, 
J. W. Hopkins, O. Kempthorne, Leopold Martin, Horace W. Norton, S. C. Pearce, 
and Georges Teissier. Managing Editor: Ralph A. Bradley; Assistant Managing 
Editor: W. A. Glenn. 


Former Editors 


Gertrude M. Cox—Founding Editor 
John W. Hopkins—Past Editor 


Officers of the Biometric Society 


General Officers 


President: C. H. Goulden; Secretary: M. J. R. Healy; Treasurer: A. W. Kimball; 
Council: M. S. Bartlett, W. U. Behrens, C. I. Bliss, G. E. P. Box, A. Bradford 
Hill, A. Buzzati-Traverso, L. L. Cavalli-Sforza, D. G. Chapman, G. Darmois, W. J. 
Dixon, C. W. Emmens, Sir Ronald A. Fisher, C. G. Fraga, Jr., Anne Lenger, C. C. 
Li, K. Mather, C. R. Rao, P. V. Sukhatme, G. Teissier, J. W. Tukey, G. S. Watson. 


Regional Officers

Region              President            Secretary            Treasurer
Australasian        C. W. Emmens         P. J. Claringbold    N. R. Sobey
Belgian and
  Belgian Congo     P. P. Denayer        L. Martin            A. H. L. Rotti
Brazilian           F. P. Gomes          P. M. Freire         A. Groszmann
British             J. O. Irwin          C. C. Spicer         P. A. Young
E. N. American      J. Cornfield         T. W. Horner         T. W. Horner
(W. T. Federer)
French              A. Vessereau         D. Schwartz          D. Schwartz
German              A. Augsberger        R. Wette             M. P. Geppert
Italian             G. Montalenti        R. Scossiroli        F. Sella
W. N. American      J. L. Hodges, Jr.    M. M. Sandomire      M. M. Sandomire

National Secretaries

Denmark        N. F. Gjeddebæk
India          K. Kishen
Japan          M. Hatamura
Netherlands    E. van der Laan
Sweden         H. A. O. Wold
Switzerland    H. L. LeRoy




BIOMETRIC METHOD, PAST, PRESENT, AND FUTURE* 


J. O. Irwin 


Visiting Professor, Department of Biostatistics 
University of North Carolina, Chapel Hill, North Carolina, U.S.A. 


You have done me a great honour in asking me to address you. When 
I asked myself why, I concluded that it was not because I could offer 
you anything new and original in biometrical practice or theory. 
That is the privilege and the opportunity of younger people many of 
whom are present tonight. However, it has been my good fortune to 
have lived in the field for many years and to have known many of the 
pioneers. I have therefore chosen a historical subject. It may be a 
good thing, once in a while to look back, to see how we have got where 
we are, in the hope of advancing more effectively into the future. 


Meaning of Biometry 


Tonight three societies are meeting together. The explanation of 
the joint meeting is to be found in the nature of biometric method. What 
is biometry? A variety of definitions is possible. We should all agree 
however that it is something to do with life and something to do with 
measurement. As the word is now used, there is a third notion involved, 
the notion of interpretation. How widely the terms “life” and “measurement” 
are to be taken may be open to question. Whatever may be the 
case with the word “biometry,” it is undoubtedly advantageous that 
the term “biometrical method” should be taken in the widest possible 
sense. 

The term “life” for this purpose may be extended to include any 
aggregate the individuals of which vary and exhibit a certain (at least 
apparent) spontaneity of behaviour. According to A. N. Whitehead, 
whose later years were spent to such advantage in America, in the 
endeavour to build up a system adequate to describe the facts (aesthetic, 
cosmological, and sociological) of the universe as it presents itself to 
modern man, the ultimate realities are “occasions of experience.” Such 


*Invited address at the joint spring meeting of the Biometric Society (ENAR), the Institute of 
Mathematical Statistics, and the Physical Science Section of the American Statistical Association. 



364 BIOMETRICS, SEPTEMBER 1959 


an “occasion of experience” in his sense of the term need have nothing 
to do with living material. It is of the essence of each “occasion of 
experience” that it is in part “determined” and in part “self-creative.” 
On this view the ultimate realities possess a stochastic element. It 
would follow that biometric method need not be limited to any one field 
of subject matter. 

The term “measurement” can, for this purpose, also be widely 
interpreted. It includes physical measurement in the strictest possible 
sense; it includes the recording of qualities, and also the region inter- 
mediate between the two. Perhaps most important of all, it includes 
counting. One of the tasks of biometrical method is to find the appropri- 
ate techniques for dealing with each kind of measurement. 

The object of biometrical method is to make general statements 
about the properties of groups from measurements made on individuals 
belonging to them. This involves estimation, inference, and interpre- 
tation. The function of mathematical statistics is to help to make this 
possible. Without mathematical statistics the subject could not exist; 
as we know it, it could not have been developed; nevertheless I believe 
that mathematical statistics are necessary but not in general sufficient 
for this purpose. 

In primitive man, observation preceded conscious theorising. He 
looked out on the world and noticed the change in the seasons. As soon 
as primitive man or primitive woman said, “When we sow seed in the 
spring, we get a crop in the autumn,” an induction had been made and a 
hypothesis formed, one which could be tested by further observation. 
Thus observation and theory have developed together from the very 
beginning. And so, as we might expect, there are, in modern biometric 
method, two main sources to the stream of development. One goes 
back to the early workers at the beginning of the seventeenth century, 
who first thought of making numerical measurements, recording qualities 
and then aggregating the results. The other goes back to the seventeenth 
and eighteenth century writers on the theory of probability and to 
Gauss in the early nineteenth century. The former line of descent leads 
to Quetelet and Galton in the nineteenth century—and here we should 
also remember Florence Nightingale whose enthusiasm for medical 
statistics amounted to a passion. The two streams unite in Karl 
Pearson and his school, Edgeworth and R. A. Fisher. 


Development of the Observational Side of the Subject 


It seems natural to deal first with the former source. If we use the 
most general sense of the term “measurement,” observational biometry 
can be classified into the following three categories: 




BIOMETRIC METHOD 365 


(1) Measurements made in relation to a single individual, 

(2) Measurements or numerical statements relating to small groups 
of individuals, 

(3) Measurements or numerical statements relating to large groups 
of individuals or even to populations as a whole. 


This classification was used by Underwood [1951] in giving a historical 
account of the development of medical statistics, but it is equally 
applicable to the subject as a whole. 

We are not concerned with the first as such. Yet we have to know 
how to measure a single individual, before we can deal with groups. 
This necessitated the invention of instruments of measurement. In 
the field of medical statistics, for example, instruments had to be devised 
for measuring such physical variables as temperature, pulse rate, blood 
pressure, and vital capacity, for making blood counts and haemoglobin 
estimations, for dealing with calorimetry and metabolism, and for 
making chemical examinations such as those of urine, gastric juice, 
blood sugar, and cerebrospinal fluid. This, it seems, started with 
Sanctorius of Padua (1561-1636) who had primitive instruments for 
measuring temperature and pulse rate—and the development has con- 
tinued until the present day. 

From the second group, twentieth century biometrics has developed. 
This group is distinguished by the opportunities it provides for 
controlled experiment. Modern knowledge of the principles of experimental 
design was first applied in agricultural science, but has been used 
more recently in ever widening fields. Instances are the study and 
control of the physical and chemical processes of industrial production, 
problems of bacteriology and virology, psychological study of cognitive, 
affective, and conative processes in relation to the environment—for 
example, the nature of learning processes or of fatigue—physiological 
studies such as the effect on animals of carcinogens, or of radiation, 
and psycho-physiological studies such as the effect of hot climates. 
Among the most noteworthy of recent fields of application are clinical 
trials and biological assay. 

However, we are anticipating! Until about 1915 biometric method 
had concerned itself only with the third group. This group includes 
economic statistics in general and the whole subject of vital statistics 
and demography with its actuarial applications. 

It is interesting to see how the latter subject developed. Of course 
information about the numbers and condition of the people was collected 
by the most ancient governments, but the beginnings of the subject in 
its modern sense were associated with people whose main interest was 




in observing, recording, or else studying by simple arithmetical methods, 
observations already made. Science was less specialised and the pioneers 
often applied these methods in many fields. For example the first 
life-table was constructed by Halley. Halley’s better known astro- 
nomical work was done by careful study of observations. To digress 
for a moment, so was Kepler’s! Purely empirically Kepler found that the 
planets moved in ellipses with the sun in the focus and that the squares 
of the periodic times varied as the cubes of the distances. It was left to 
Newton to construct the theoretical model. (Incidentally in his book 
on probability Harold Jeffreys remarks that a modern significance test 
would have shown that Kepler’s observations were inconsistent with 
Kepler’s deductions, which nevertheless provided the basis for the 
Newtonian Theory.) 

Halley’s life-table was published in the Philosophical Transactions 
of the Royal Society of London for 1693. It is entitled, “An estimate of 
the degrees of the Mortality of Mankind drawn from curious Tables of the 
Births and Funerals in the City of Breslau, with an attempt to ascertain 
the price of annuities upon lives.” Halley obtained his survivors at ages 
by adding up the deaths; in other words he assumed a stationary 
population. Modern demographers and actuaries would consider this 
very shocking, but the results were reasonable and his table provided a 
basis for all future work. 
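
Halley's step of "adding up the deaths" can be illustrated with a short sketch. It assumes a stationary population, that is, constant births, unchanging mortality, and no migration, so that the deaths recorded at each age in one year are those a single birth cohort would suffer over its lifetime. The function name and the death counts below are hypothetical, chosen only for illustration; they are not Halley's Breslau figures.

```python
from typing import List


def survivors_from_deaths(deaths: List[int]) -> List[int]:
    """Return survivors to the start of each age interval.

    deaths[x] is the number of deaths observed in age interval x.
    Under the stationary-population assumption, the survivors to the
    start of interval x are the sum of deaths in interval x and all
    later intervals.
    """
    survivors: List[int] = []
    running_total = 0
    # Accumulate from the oldest interval downwards.
    for d in reversed(deaths):
        running_total += d
        survivors.append(running_total)
    survivors.reverse()
    return survivors


# Hypothetical deaths in six successive age intervals (illustrative only).
deaths = [300, 100, 50, 50, 100, 400]
print(survivors_from_deaths(deaths))  # [1000, 700, 600, 550, 500, 400]
```

If the population is growing or shrinking, or mortality is changing, the deaths of one year no longer correspond to a single cohort, which is why the shortcut would now be thought unsafe.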


Development of Theory 


We cannot in the short time at our disposal trace the development 
of probability theory through the late seventeenth, eighteenth, and 
early nineteenth centuries. However, it is a fact that many of the 
results believed to be modern are to be found in the work of the early 
writers. For example, when—about 1925—I derived a general form 
for the sampling distribution of the mean by a Fourier inversion of 
what is now called the characteristic function, I thought I had found 
something new. But the result, for the rectangular distribution at any 
rate, was discovered by Lagrange. However the earliest writers con- 
fined their discussion to games of chance; the most noteworthy con- 
tributions to applied mathematics of the greatest figures of the eighteenth 
century such as Lagrange and Laplace were in other fields, mainly that 
of dynamics. De Moivre laid the foundations of actuarial method by his 
work on life-annuities and Gauss developed the theory of errors of 
observation. But the biological field, as usually understood, was hardly 
explored at all. 

We have to take a leap! The modern age of biometrical method 
started with Karl Pearson, born in 1857. Pearson started as a Cambridge 
mathematician. He was appointed to the Chair of “Applied 
Mathematics” at University College London in 1884. His youthful 
interests in mediaeval literature, in law and in socialism, lie outside the 
scope of this talk. The “Grammar of Science,” which was largely 
written in 1890 is, however, relevant. One of the best accounts extant 
of what Science really does, it took the view that scientific laws are 
descriptions in “conceptual shorthand” of data derived from our sense 
perceptions. With the nature of the realities behind the sense-data 
Pearson refused to concern himself. He felt, perhaps unconsciously, 
that entanglement in metaphysical questions would have interfered 
with his power to develop, as he wished, the subject which he did in 
fact develop with immense drive. 

Bio-statistical techniques can be divided into those which are 
exploratory and those which are aimed at testing theoretical models. 
Pearson was interested in scientific theories; sometimes, as in his famous 
controversy over Mendelism with Bateson, he attacked them violently. 
But one feels that the predominant contribution to biometry of Pearson 
and his school was in the field of exploratory techniques. The techniques 
of correlation and regression which he developed theoretically were of 
this kind. He and his followers felt that by applying them to sufficiently 
large bodies of data and studying the results, we might learn to understand 
the nature of the processes going on. The occasions when they 
tested particular hypotheses or “models” were relatively rare. 

Pearson’s system of frequency curves was based on an extremely 
general hypothesis, that “contributory cause groups” were at work, 
which would lead to hypergeometric distributions. This led him to the 
problem of representing such distributions by continuous frequency 
curves—which was solved by the use of the well known differential 
equation, expressing the parameters as functions of the first four 
moments. His main use of them was exploratory: to find empirical 
formulae for observed frequency distributions. In fact they have also 
frequently been found valuable in getting an idea of what a theoretical 
distribution is like when we can find its moments. “Student” (W. S. 
Gosset) discovered the exact sampling distribution of the variance from 
a normal universe, and the “t” test, by this method; in this case the 
result was later shown by Fisher to be exact. Exploratory techniques 
will always be essential to biometry. Perhaps the best example, among 
techniques currently used, is that of “factor analysis.” The viewpoint 
of the Grammar of Science accounts for the subsequent emphasis on 
exploratory techniques. 
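
For reference, the differential equation behind Pearson's system of frequency curves is commonly written in the form below. The symbols a, b_0, b_1, and b_2 are notation assumed here rather than taken from the address, and sign conventions differ between authors; these are the constants which are expressed as functions of the first four moments.

```latex
\frac{1}{y}\,\frac{dy}{dx} \;=\; -\,\frac{x + a}{b_0 + b_1 x + b_2 x^{2}}
```

As a check on the form, setting b_1 = b_2 = 0 with b_0 > 0 gives y proportional to exp(-(x + a)^2 / (2 b_0)), the normal curve.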

The year 1890 was also important to Pearson for another reason. 
Weldon was appointed to the chair of Zoology at University College; 




this stimulated Pearson’s interest in biology and it was probably through 
Weldon that he came to meet Francis Galton. This was the turning 
point in Pearson’s life. Galton was one of the greatest scientific figures 
of the nineteenth century and the first in England to see the possibilities 
of statistical method applied to biology, especially to heredity, but also 
to medicine, anthropometry, and psychology. He inspired Pearson 
with a lasting enthusiasm. Galton was not a mathematician and Pearson 
supplied this necessary specialised knowledge. Starting in the nineties 
Pearson published his “Mathematical Contributions to the Theory of 
Evolution” (1893-1900) in which he developed the theory of correlation 
and regression and the Pearsonian system of frequency curves. These 
appeared in the Philosophical Transactions of the Royal Society, and 
after some years resulted in the controversy with Bateson over 
Mendelism. Resulting difficulties with the Royal Society over publication 
led to the foundation of Biometrika in 1901. 

In the early years of the present century Pearson was still Professor 
of Applied Mathematics at University College, but he was running a 
Eugenics Laboratory and a Biometric Laboratory at the same time. 
He continued to have the most amazing energy. In 1911 Galton died 
and left money for the endowment of a Chair of “National Eugenics.” 
Pearson then resigned his Chair of Applied Mathematics and became 
the first Galton Professor. He called his department the “Department 
of Applied Statistics and Eugenics” and ran all these activities together: 
Statistical Theory and Practice and Biometry including Anthropometry 
and Eugenics. Ultimately he completed fifty years as a Professor at 
University College, London. In 1957, the centenary of his birth, a 
Memorial Lecture was delivered by J. B. S. Haldane. The following 
passage from Haldane’s tribute seems to me well balanced and just: 

“I believe that his theory of heredity was incorrect in some funda- 
mental respects. So was Columbus’ theory of geography. He set out 
for China and discovered America. But he is not regarded as a failure 
for this reason. When I turn to Pearson’s great series of papers on the 
mathematical theory of evolution, published in the last years of the 
nineteenth century, I find that the theories of evolution now most 
generally accepted are very far from his own. But I find that in the 
search for a self-consistent theory of evolution he devised methods 
which are not only indispensable in any discussion of evolution. They 
are essential in every serious application of statistics to any problem 
whatever. If for example I wish to describe the distribution of British 
incomes, the response of different individuals to a drug, or the results of 
testing materials used in engineering, I must start off from the founda- 
tion laid in his memoir, ‘Skew variation in homogeneous material.’ 




After sixty-three years I shall certainly take some short cuts through 
the jungle of his formulae, some of which he himself made in later 
years. Very few ships today follow Columbus’ course across the 
Atlantic! 

“Let me put the matter another way. Anyone reading the controversy 
between Pearson and Weldon on one side, and Bateson and his colleagues 
on the other, which reached its culmination about fifty to fifty-five 
years ago, might have said, ‘I do not know who is right, but it is certain 
that at least one side is wrong.’ In fact both were right in essentials. 
The general theory of Mendelism is, I believe, correct in a broad way. 
But we can now see, that if Mendelism were completely correct, natural 
selection, as Pearson understood it, would not occur. For the frequency 
of one gene could never increase at the expense of another, except by 
chance, or as we now put it, sampling errors. It is just the divergence 
between observed results and theoretical expectations, to which Pearson 
rightly drew attention, which gives Mendelian genetics their evo- 
lutionary importance.” 

Edgeworth was slightly older than Pearson. He was for many 
years Professor of Economics at Oxford. While Pearson’s work and 
Pearson’s writing were characterized by definiteness and sharp clarity 
of outline (one might sometimes think he was wrong but one was never 
in doubt as to what he meant), Edgeworth was courtly, urbane, and 
diffident. His writing had a certain obscurity and there were frequent 
digressions with learned allusions and classical quotations. Perhaps 
that is why his statistical work, which was of first class originality and 
importance, is not better known than it is. He saw deeply into the 
subject. He discovered the general expression for frequency distri- 
butions in terms of the successive derivatives of the normal frequency 
function and used it to considerable advantage; and one can see fore- 
shadowed in his work both maximum likelihood and analysis of variance. 

A quotation from A. L. Bowley’s account of his work illustrates his 
manner. Most of this is in Edgeworth’s own words. 

“In 1908 it is again shown ‘how widely the subjective element 
enters into the calculus of probabilities’ for the whole must in the end 
be related to credibility; similarly utility must be ultimately relative 
to the feeling of happiness, if it is to have an intelligible meaning. In 
both economics and statistics classical writers seem to have over- 
estimated the precision of their statements owing to ‘undue confidence 
in untried methods of deduction.’ ‘There is even a similarity between 
the particular instruments in each department which have thus proved 
treacherous—here the postulate of complete independence between 
events; there the postulate of perfect competition between persons.’ 




‘There is common to both studies a certain speculative or dialectical 
character, which recalls the ancient philosophies. “In wandering mazes 
lost” too often both pursue inquiries which seem to practical intellects 
interminable and uninteresting. These characteristics, it is to be 
feared, may seem to attach to inquiries, like those pursued in an earlier 
portion of this paper, into the a priori probabilities of various measurements. 
In economics too, other peoples’ mathematics are apt to resemble the 

    “dark lantern of the spirit 
    which none see but those that bear it.”’” 


Yule, Greenwood, and “Student” were all Pearson’s pupils. Yule 
was the first. He ranks among the greatest of statisticians. He became 
lecturer in Statistics and a Fellow of St. John’s College, Cambridge, 
where he remained until he died. He had learning, humour, strong 
common sense, and originality. He put the theory of correlation and 
regression in a form in which it could be used by people of quite modest 
mathematical attainments. Much of this was incorporated in the 
famous textbook, An Introduction to the Theory of Statistics, in which 
the viewpoint is all his own. This, the first text book, still continues 
in its modern form, “Yule and Kendall” and for a general description 
of the field has never been surpassed. His originality is evidenced by 
his work with Willis on the distribution of species, by his studies of 
literary vocabulary, and (apart from the work of McKendrick which I 
have described elsewhere) by the earliest example of a stochastic 
process in the literature, his study of Wolfer’s sunspot numbers. 

Yule and Greenwood were closely associated. Greenwood’s work on 
medical statistics and epidemiology set up sign-posts which pointed the 
way to the present teaching of the subject in schools of public health. 

W. S. Gosset, who for most of his life wrote under the pseudonym 
of “Student,” was a pupil of Pearson’s, originally a chemist. He spent 
the whole of his working life with Guinness’s of Dublin and ultimately 
came to be Chief Brewer at their London Brewery of Park Royal. The 
Statistical Department, which he started, must have been the earliest of 
industrial statistical research departments. He early sensed the need 
for statistical method in industrial research, particularly the theory of 
small samples. He was not a professional mathematician; he obtained 
his theoretical results by simple algebra and extraordinary insight. 
The man who by such simple methods discovered the “t” test and the 
sampling distribution of the correlation coefficient in the null case 
deserves to be described, as he was described by Sir Ronald Fisher, 
as the Faraday of the subject. 




A man of great modesty and outstanding charm, “Student” was 
never known to quarrel with anybody. He was on the friendliest of 
terms with both Karl Pearson and R. A. Fisher, and thus he formed a 
bridge between the older and newer biometric schools. 

The older schools had developed techniques which were mainly 
exploratory. Fisher developed the techniques necessary to make the 
subject useful in the field of scientific experiment. Further, his work 
has always emphasised the importance of building up relevant hypotheses, 
or, as many people nowadays like to call them, “models,” which 
may be tested by suitable techniques applied to the data of observation 
and experiment. In summary, I should say, that the importance of 
his work lies: 


(1) In the development of the theory of small samples (and therefore 
of samples of any size), 

(2) In the systematic development of tests of significance, 

(3) In the development of the theory of estimation, showing what can be 
done independently of probabilities a priori, 

(4) In laying down the principles of experimental design and providing 
us with a technique for analysing the results, 

(5) In the development of much of the technique of multivariate 
analysis and of discriminant function analysis, 

(6) In the reconciliation of the Mendelian and biometric views of 
heredity by working out the consequences of particulate inheritance 
when applied to populations at large and in the development of 
statistical methods appropriate to genetics. 


Sir Ronald Fisher is one of the few men whose fame has become 
legendary in his own lifetime. Towards the end of the 1914-18 war 
it occurred to Sir John Russell, who had become director of the 
Rothamsted Experimental Station, that they had a tremendous lot of 
figures dealing with about eighty years’ experiments, and perhaps a 
mathematician might be able to make something of them. So it came 
about, that in 1919 Fisher went to Rothamsted and started out, just 
with a table and a calculating machine, to see what he could make of 
the Rothamsted figures. In about ten years he had built up a statistical 
department, which was already becoming well known in a great many 
parts of the world. 

How did he do this? Well, he was (and is) a mathematical genius 
and he had great biological insight. He really understood the biologists’ 
point of view and was tremendously quick in the uptake, also able 
very rapidly to switch an extreme degree of concentration from one 
subject to another, apparently quite effortlessly. 




Now biologists, at any rate those of the calibre who were at 
Rothamsted in the twenties, are very intelligent people. They know 
what they are doing and what they want to get at. They soon found, 
after talking with Fisher, that he understood their problems. When 
Fisher said to them, “Well, if you go and do this or that sort of calculation 
with your results, you will get what you want,” they did not in 
the least know how these particular techniques could be justified—but 
they went away and tried. They soon found they got what they wanted. 

Wherever one goes in the world now, one finds experiments being 
designed and analysed by methods which are essentially Fisher’s. I 
wonder sometimes how many of the statistical departments and institutes 
of the world would be making the progress they are making, were it not 
for the work of Fisher. Perhaps, but for that, I should not be talking 
here tonight. 

Fisher started as a Cambridge mathematician. To Cambridge he 
returned as Professor of Genetics, and though retired from his chair, 
is still full of fertile ideas and scientific activity. There is none who 
could say with greater justification, “Si monumentum requiris, circumspice.” 

The Present 


And so we come to the Present. Most of us regard the present as 
the period which includes the activities of our own contemporaries. As 
the years go by the meaning of the term tends to become more elastic. 
It stretches towards the past. At any rate, I feel that this is the right 
place to recall Neyman and E. S. Pearson’s work on hypothesis testing. 
If it had done nothing more than introduce the notion of the power of a 
test, that alone would have made it of the greatest importance—and 
I believe it provides the basis of much theoretical work that is being 
done in America at the present time. It must have helped to inspire 
Abraham Wald. 

The second world war was a great stimulus to the development of 
statistical and biometric method. On the theoretical side this led, I 
think, through sequential analysis to decision functions and much of 
the work of the modern American school on statistical inference. With 
this work I am not sufficiently acquainted to be able to say very much. 
But I note that Lancelot Hogben (though he did not allude to the 
modern American school) has spoken recently of the crisis in contemporary 
statistical theory. This may be an exaggeration; however, 
there is much re-thinking going on about fundamentals, and of course 
quite a lot of controversy. We have all got to make up our minds 
where we stand with regard to statistical inference. Dr. L. J. Savage 
classifies theories of probability as objectivistic, necessary, or personalistic. 




To me it seems that there are important distinctions between different 
ways in which the “stochastic” element can enter into our work. The 
“statistical models” which we build up are just as objective as other 
scientific “models” or “theories” which do not contain a stochastic 
element. When we test our “models” we use at least some logical 
processes which are necessary, and the question of prediction from them— 
which raises the greatest difficulties—involves subjective questions, 
whether we regard them as personalistic or, as does Harold Jeffreys, 
common to all rationally-minded persons. 

The development, during the contemporary period, of stochastic 
process theory has been striking. Here we should not forget the earlier 
work of McKendrick, but I am thinking now, for example, of the work 
of Feller, Bartlett, D. G. Kendall, and their associates. The epidemiological 
developments have been important; a good account is 
given in Norman Bailey’s recent book. But whether we look at the 
theory of cosmic rays, demography, the growth of bacterial populations, 
population genetics, industrial renewal theory, queues or economic time 
series, we can find instances of the application of stochastic process 
theory and signs that these are only forerunners of what is to come. 

The theory of communication and information (in Shannon’s sense) 
might also be classed under “stochastic process theory” but is perhaps 
better considered as a subject in its own right. In listening to a recent 
talk on the theory of information by my colleague Dr. R. R. Kuebler, 
I could not but be struck by his description of the possibilities it offers 
in relation to the understanding of the mechanisms of communication 
within the living body. 

On the strictly applied side, two of the most noteworthy achievements 
during the contemporary period have been the work on clinical 
trials and biological assay. I think the success attained in these par- 
ticular fields was due to the fact that the initial stimulus came from 
people who thoroughly understood the subject matter of the field with 
which they were dealing—and yet had sufficient knowledge of statistical 
techniques to appreciate their possibilities. 

The work on clinical trials of Bradford Hill and his collaborators 
is distinguished by clear-headed adaptation of means to ends. They 
know what questions they want to answer; and devise the simplest 
means of answering them consistent with the canons of scientific method. 
They have always kept in mind the main desiderata for such trials. 
These have been described in a number of publications, but it is only 
necessary to call attention to two which I think they would put first: 
they must be ethical and should be directed to answering a few questions 
which must be formulated in absolutely precise terms. 

In connexion with biological assay one thinks particularly of the 
late Dr. J. W. Trevan, who spent most of his working life in the Wellcome 
Research Laboratories. He initiated a clear terminology in the 
first place, and succeeded in getting across to the medical profession 
the importance and relevance to their work of the fact of animal variation. 
The least dose of a drug that will kill a guinea pig (incredible 
as it may seem now) was thought of earlier as being a fixed and invariable 
quantity. His introduction of the notion of the L.D. 50, or in 
general the median effective dose of a drug, was a very great step in 
advance. 

Gaddum too was a pharmacologist to begin with. He worked with 
Trevan first and later at the National Institute of Medical Research, 
London. Percival Hartley, Trevan, and Gaddum made biological 
standardisation possible. I remember gratefully the encouragement 
the first of these three great men always gave me in developing the 
statistical side of the subject. In England, Gaddum’s 1933 report 
initiated the serious statistical study of biological assays depending 
on quantal responses; about the same time, in the United States, Bliss 
was considering independently the same sort of problems. He also 
started from the biological end. 


The Future 


Finally in the strictly technical field we have the invention of 
electronic computers; and this brings us to the Future. It seems to me 
that this is what is most likely, in the immediate future, to make startling 
if not revolutionary changes in statistical and biometric practice. The 
saving of time in all computational processes, the possibilities of direct 
numerical calculation where analytical expressions prove intractable— 
including Monte-Carlo methods—open up new vistas of possible 
progress. It is to be hoped that they will be used to the best advantage. 
We all of us have to guard against the danger of becoming the slaves of 
our own techniques, and here too it is important that man must remain 
the master of the machine. 

There seems no sign, as yet, that the enthusiasm for statistical and 
biometric methods will decline. All the indications are to the contrary. 
I would hazard a guess (and after all the literal meaning of the word 
“stochastic” is “conjectural”) that the science of experimental design will 
continue to grow and that in the fields of stochastic processes and 
communication theory we shall see great advances. Sampling survey 
methods will continue to develop. As organisations grow in magnitude, 
still more demand will come from administrators for methods capable 
of dealing with the properties of groups, which it is the essence of our 
subject to provide. 




Yet we must remember that somewhere or other our subject has 
boundaries. Any particular science is in the literal sense an “abstrac- 
tion” and in that sometimes rather dim borderland which separates 
the entity abstracted from the greater whole from which it came, 
there lurks the possibility of mistakes and misapplications. It will be 
the task, one day, of some philosophically-minded person to explore 
this area more fully. Anyone who has reflected at all deeply on the 
distinction between vocational selection and vocational guidance will 
understand what I mean. 

As regards the future of statistical inference, it is my personal con- 
viction that we cannot get beyond the position, stated as early as 
Hume, that belief in the validity of inductive inference is in the last 
resort a matter of faith and faith only. That it is a faith in some degree 
shared by all mankind is perfectly true. I naturally believe therefore 
that differences of opinion will persist. I think there will be cycles of 
fashion in which now the objective and now the subjective views will be 
dominant, and that for all these reasons one might do worse than 
conclude with the advice tendered on a famous occasion by Oliver 
Cromwell, “I beseech you ... think it possible you may be mistaken.” 

Now that I have come to the end of this talk I have the feeling that 
it has contained a lot of past, some present, and very little future. I 
can only plead in extenuation that these proportions are not very different 
from those in my own experience, which was my only justification for 
talking at all. 




OPTIMUM GROUP SIZE IN HALF-SIB FAMILY SELECTION 


J. M. RENDEL 
Commonwealth Scientific and Industrial Research Organization 


Animal Genetics Section, Zoology Department, Sydney University 
Sydney, Australia 


When indirect methods of selecting breeding stock have to be used 
there is often a limit to the number of offspring or other relatives that 
can be examined. Alan Robertson in a recent paper [1957] discussed 
the problem of deciding how many sires should be tested on how many 
offspring when selecting by progeny testing with a given size of test 
population. He shows that a general solution can be arrived at if the 
number of offspring tested for each sire is expressed in terms of the 
fraction of sires kept for breeding. If there are S sires kept for breeding 
and N animals tested all told, n for each sire, the proportion of sires 
selected, p, is nS/N since N/n is the number of sires tested. N/S, the 
number of tested animals for each sire kept, is Robertson’s testing 
ratio, K. It turns out that the genotypic superiority of selected sires 
is proportional to $(Z/p)\big/\sqrt{1 + a/(pK)}$, where $a$ is $(4 - h^2)/h^2$, $h^2$ is the 
fraction of variation that is genetic, and $Z/p$ the superiority of selected 
sires in standard measure. This expression can be differentiated to 
give a value of K in terms of p, which maximises the genetic superiority 
of selected sires. 
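
The maximisation just described can also be carried out numerically. The following is a minimal sketch, assuming a normally distributed progeny average so that Z is the ordinate of the standard normal curve at the point of truncation; the function names and the grid search are illustrative only and are not taken from Robertson's paper.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and s.d. 1


def intensity(p):
    """Z/p: mean, in standard measure, of the best fraction p of a normal population."""
    x = STD_NORMAL.inv_cdf(1.0 - p)   # point of truncation
    return STD_NORMAL.pdf(x) / p      # ordinate Z at that point, divided by p


def sire_superiority(p, K, h2):
    """(Z/p) / sqrt(1 + a/(pK)) with a = (4 - h2)/h2; proportional to the genetic gain."""
    a = (4.0 - h2) / h2
    return intensity(p) / sqrt(1.0 + a / (p * K))


def best_fraction(K, h2, steps=1000):
    """Hypothetical helper: grid search for the p that maximises sire_superiority."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda p: sire_superiority(p, K, h2))


p_opt = best_fraction(K=1000, h2=0.25)
n_opt = p_opt * 1000   # offspring per sire, since n = pK
```

A grid search is used instead of the derivative only to keep the sketch short; any one-dimensional optimiser would serve equally well.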

Robertson states that exactly the same formulae will hold in half-sib 
family selection. This is no doubt true enough for most practical 
purposes but is not strictly true. 

In progeny testing when a sire is chosen on the performance of his 
offspring, each being by a different dam, he is chosen to be used again 
on other dams, or if he is not to be used again his offspring, by dams 
other than those in the test, are to be kept. It is the assessment of the 
sire’s genotype which is decisive. The dams of his progeny are irrele- 
vant, except as a possible source of error or bias in his evaluation; they 
cannot contribute to his genotype. On the other hand, in the half-sib 
family selection system the dams are not irrelevant since it is members 
of the half-sibship that are used again or kept and, provided the dams 
of the half-sibship remain the same and are properly represented in the 
test, the mean genotypic level of the dams contributes to the genotype 
of the half-sibship. Suppose, for example, that cockerels are mated to a few 
hens each and their progeny are assessed on a sample of birds from each 
half-sibship. The information about the genotypic value of the half-sibship 
then includes a fraction due to correlation between the mean genotypic 
value of the few dams and the genotypic value of their progeny. 
The test estimates not only the genotypic value of the sire but also that 
of his mates. In half-sib selection it is quite possible, where the number 
of dams is not very large, that they will have a mean genotypic value 
different from the mean of the population from which they are drawn, 
and this will be reflected in the test of the half-sibship and in its geno- 
typic value. 

In the general formula given by Robertson, the expected genetic 
superiority of selected sires is 


$$\Delta G = \bar{\imath}\, r_{IG}\, \sigma_G ,$$


in which $\bar{\imath}$ is the selection intensity in standard units, $r_{IG}$ the correlation 
between the progeny average, $I$, and the breeding value of a sire, and 
$\sigma_G$ is the genetic standard deviation. The correlation between the 
breeding value of a bull and the average performance of his daughters 
is given by the formula 


$$r_{IG} = \sqrt{\frac{n\,\tfrac14 h^2}{1 + (n-1)\tfrac14 h^2}} ,$$


where $h^2$ is the fraction of variance that is genetic and $n$ is the number 
of daughters, on the assumption that there are no non-genetic 
causes of correlation between the scores of a bull’s daughters (see 
Robertson [1955]). In the general formula $\Delta G = \bar{\imath}\, r_{IG}\, \sigma_G$, $\Delta G$ can stand 
for the superiority of a sibship instead of a bull, in which case $r_{IG}$ is 
the correlation between the average performance of a group, $I$, and its 
breeding value. $r_{IG}$ should take into account the correlation between 
the mean performance of the dams and their daughters. The correlation 
between the breeding value of a half-sibship and its mean score turns 
out to be 


$$\frac{(1+n)\,\tfrac14 h}{\sqrt{n + (n-1)(n+1)\tfrac14 h^2}} .$$


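If selection is being based on the index, $I$, genetic advance when 
progeny testing a sire is 


$$\Delta G = \bar{\imath}\,\sigma_G \sqrt{\frac{n\,\tfrac14 h^2}{1 + (n-1)\tfrac14 h^2}} ,$$


but when selecting a half-sibship it is 


$$\Delta G = \bar{\imath}\,\sigma_G\, \frac{(1+n)\,\tfrac14 h}{\sqrt{n + (n-1)(n+1)\tfrac14 h^2}} ,$$


which may be written to compare with the progeny test situation 


$$\Delta G = \tfrac12\, \bar{\imath}\,\sigma_G \sqrt{\frac{(1+n)^2\,\tfrac14 h^2}{n + (n-1)(n+1)\tfrac14 h^2}} .$$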


The expected response to selection in half-sib selection is often written 
as $\Delta G = \bar{\imath}\, h_f\, \sigma_{G_f}$, where $h_f$ is the square root of the heritability of a 
family average and $\sigma_{G_f}$ its genetic standard deviation. It must be 
remembered that $\sigma^2_{G_f}$ is the genetic variance of the half-sibship and 
obtains a contribution from the dams; it is $\sigma_G^2$ for sires but $\tfrac14 \sigma_G^2 (n+1)/n$ 
for a half-sibship. $h_f^2$ in the half-sib case is 


$$h_f^2 = \frac{(1+n)\,\tfrac14 h^2}{1 + (n-1)\tfrac14 h^2 + (n-1)(h^2/4n)} .$$


If all groups in a population of half-sib groups are tested it is obvious 
that the best result comes from a population in which the fraction saved 
for breeding is exactly one sib group; for if S animals out of a total N 
are required for replacement, the proportion selected is the same if we 
take one group out of N/S groups or two out of 2N/S and so on, but 
the smaller the number of groups the greater the number per group 
that can be scored out of a fixed number scored and so the greater the 
accuracy of the estimate of their genetic merit. This ignores any differ- 
ence there may be in selection differential between 1 out of x and 2 out 
of 2x. In practice more than one group will be used to avoid inbreeding 
and gene fixation. It is safe to say that, when all groups are tested, 
the fewer the groups the better provided there is a sufficient number to 
allow the maximum selection differential. 

It may be thought better on some occasions not to test all groups 
in a population but only those that can be tested adequately with 
facilities available. If N animals can be scored each generation, it 
may be possible to test N groups of which one will be required; this 
gains the maximum selection differential but the test is based on one 
animal per group. The alternative is to test fewer groups more 
thoroughly; this sacrifices selection differential for accuracy. The 
value of 


$$\bar{\imath}\sqrt{\frac{(1+n)^2}{(4n/h^2) + (n^2-1)}}$$


has been calculated for N = 1000, 500, 200, and 100 at heritabilities 
of 0.25, 0.5, and 1.0, when the number of groups tested is 100 or less 
and the number required is one. For example in Table 1 with 500 
animals tested and a heritability of 0.25, one might be able to test 100 
groups on 5 offspring each and pick one group; or one might test 50 
groups on 10 offspring each and pick one group; the genetic superiority 
of the group so picked is shown in the third column for the first test 
and in the fourth for the second. The superiorities are 1.575 and 1.646 
respectively showing it is better to test fewer groups. In this set of 
examples N is equal to K. 


TABLE 1 
HALF-SIB SELECTION* 

Number                     Proportion of Groups Selected (p) 
Tested   $h^2$ 
 (N)             0.01     0.02     0.04     0.05     0.10     0.20     0.40 

               (n=10)   (n=20)   (n=40)   (n=50)  (n=100)  (n=200)  (n=400) 
 1000    0.25   1.816    1.888    1.8705   1.833    1.645    1.358    0.951 
         0.50   2.189    2.153    2.021    1.957    1.698    1.386    0.960 
         1.00   2.483    2.323    2.107    2.019    1.732    1.400    0.970 

                (n=5)   (n=10)   (n=20)   (n=25)   (n=50)  (n=100)  (n=200) 
  500    0.25   1.575    1.646    1.677    1.669    1.558    1.316    0.941 
         0.50   2.002    1.984    1.914    1.875    1.662    1.358    0.960 
         1.00   2.403    2.251    2.064    1.998    1.715    1.386    0.970 

                (n=2)    (n=4)    (n=8)   (n=10)   (n=20)   (n=40)   (n=50) 
  200    0.25   1.354    1.855    1.398    1.401    1.365    1.218    0.892 
         0.50   1.842    1.767    1.720    1.689    1.558    1.316    0.941 
         1.00   2.403    2.178    1.978    1.916    1.680    1.372    0.960 

                (n=1)    (n=2)    (n=4)    (n=5)   (n=10)   (n=20)   (n=40) 
  100    0.25   1.3835   1.227    1.204    1.215    1.190    1.092    0.844 
         0.50   1.896    1.670    1.570    1.545    1.435    1.246    0.912 
         1.00   2.670    2.178    1.935    1.854    1.628    1.344    0.951 

*Tabled values of $\bar{\imath}\sqrt{(1+n)^2 / [(4n/h^2) + (n^2-1)]}$, the genetic superiority of selected family. 
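
The two entries quoted in the worked example can be approximated directly from the tabled expressions. The sketch below assumes that the selection intensity is the normal-curve value Z/p with p = n/N (one group required, so that N = K); the function names are illustrative. The results agree with the printed 1.575 and 1.646 only to about the second decimal place, presumably because the printed tables rest on slightly different intensity values.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def intensity(p):
    """Selection intensity Z/p for a proportion p selected (normal theory)."""
    x = STD_NORMAL.inv_cdf(1.0 - p)
    return STD_NORMAL.pdf(x) / p


def half_sib_factor(n, h2):
    """sqrt((1 + n)^2 / (4n/h2 + n^2 - 1)), the expression tabled for half-sib selection."""
    return (1 + n) / sqrt(4.0 * n / h2 + n * n - 1)


def progeny_test_factor(n, h2):
    """sqrt(n(h2/4) / (1 + (n - 1)(h2/4))), the expression tabled for progeny testing."""
    q = h2 / 4.0
    return sqrt(n * q / (1.0 + (n - 1) * q))


N, h2 = 500, 0.25
results = {}
for n in (5, 10):                      # 100 groups of 5, or 50 groups of 10
    p = n / N                          # one group is kept out of N/n tested
    results[n] = (intensity(p) * half_sib_factor(n, h2),
                  intensity(p) * progeny_test_factor(n, h2))
```

Each entry of `results` holds the half-sib value followed by the progeny-test value for the same n, so the two systems can be compared side by side.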

The results are shown in Tables 1 and 2 and for $h^2$ = 0.25 in Figure 
1. It can be seen that when the number tested is of the order of 100 and 
$h^2$ is 0.25 it pays to maximise the selection differential. The same is 
true of 200 tested though there is a slight optimum at n = 10. When 
500 are tested it does not pay to increase the selection differential to a 
point which reduces n to less than 20. When 1000 are tested there is 
very little in it, the optimum being near n = 30. With heritabilities 
of 0.5 and 1 it always pays within the limits of the table to maximise the 
selection differential. The trends are compared with those used in 
progeny testing. As was to be expected, except where n is small, there 
is not much difference between the two calculations. When progeny 
testing, the rate of advance falls off rapidly as n decreases beyond the 
optimum point but in half-sib selection this is not so; in some cases the 
rate of advance continues to increase until n = 1 when half-sib becomes 
full sib selection. 


TABLE 2 
PROGENY TESTING* 

Number                     Proportion of Sires Selected 
Tested   $h^2$ 
 (N)             0.01     0.02     0.04     0.05     0.10     0.20     0.40 

               (n=10)   (n=20)   (n=40)   (n=50)  (n=100)  (n=200)  (n=400) 
 1000    0.25   1.687    1.830    1.834    1.807    1.631    1.350    0.935 
         0.50   2.048    2.084    1.984    1.930    1.692    1.376    0.961 
         1.00   2.342    2.225    2.073    2.000    1.724    1.390    0.966 

                (n=5)   (n=10)   (n=20)   (n=25)   (n=50)  (n=100)  (n=200) 
  500    0.25   1.335    1.529    1.625    1.629    1.535    1.305    0.935 
         0.50   1.722    1.856    1.851    1.821    1.640    1.354    0.935 
         1.00   2.112    2.122    2.004    1.947    1.699    1.379    0.962 

                (n=2)    (n=4)    (n=8)   (n=10)   (n=20)   (n=40)   (n=50) 
  200    0.25   0.916    1.111    1.268    1.302    1.323    1.194    0.890 
         0.50   1.258    1.459    1.570    1.580    1.507    1.292    0.930 
         1.00   1.687    1.830    1.834    1.807    1.631    1.350    0.953 

                (n=1)    (n=2)    (n=4)    (n=5)   (n=10)   (n=20)   (n=40) 
  100    0.25   0.668    0.830    0.987    1.030    1.106    1.057    0.892 
         0.50   0.905    1.140    1.296    1.329    1.342    1.205    0.895 
         1.00   1.348    1.529    1.625    1.629    1.535    1.305    0.935 

*Tabled values of $\bar{\imath}\sqrt{n\,\tfrac14 h^2 / [1 + (n-1)\tfrac14 h^2]}$, the genetic superiority of selected sire. 

FIG. 1 
Curves showing the effect of altering the number of groups tested with a fixed population tested when 
heritability is 0.25 and number tested, N, = 100, 200, 500, and 1000. Vertical axis: genetic superiority 
of selected group; horizontal axis: number of animals in a group. Separate curves are drawn for the 
progeny test and for half-sib selection. 


Where a single group is not large enough to provide all replacements, 
the formulae given here apply as if to a population of test animals equal 
to N/S where S is the number of groups saved for breeding and N is the 
total number of animals tested. I assume throughout that in half-sib 
selection the dams of the animals used for breeding are also the dams of 
animals used in the sib test as they normally are in poultry breeding, 
and have not considered complications added by mixing full and half-sib 
selection. If n is the number of dams and d the offspring per dam 
and N = nd, 


+ 2d) 
VN + +d) —d—n] 


= 


REFERENCES 
Robertson, A. [1955]. Prediction equations in quantitative genetics. Biometrics 11: 
95-8. 


Robertson, A. [1957]. Optimum group size in progeny testing and family selection. 
Biometrics 13: 442-50. 




PAIR COMPARISON, WITH AND WITHOUT TIES¹ 


N. T. GRIDGEMAN 


Division of Applied Biology, National Research Council 
Ottawa, Canada 


SUMMARY 


A probabilistic model for pair comparison, in various experimental 
settings, and with and without admission of ties, is described. When 
discrimination is the objective, admission of tied decisions theoretically 
increases the power of the test of the null hypothesis, but in practice 
this may be offset by a decrease in the subject’s efficiency of decision, 
and in these circumstances it is better to prohibit ties. When preference 
is the objective, ties should be admitted as they add information. 


INTRODUCTION 


In pair comparison, by which is meant a decision as to which of two 
items has a given attribute (not necessarily preference), an intermediate 
response class is created if the answer “Can’t decide” or “Neither” or 
some such tie is permitted. Ties transform a binomial system into a 
2-parameter trinomial one. We shall discuss a probabilistic model that 
will allow for ties and that will accommodate all relevant testing situ- 
ations. A broad division of tests can be drawn up: one type concerned 
with variable response in a single subject, and the other with the dis- 
tribution of (mostly) constant responses in a population of subjects. In 
the former, which is of special interest to psychophysicists, we hypoth- 
esize a stochastic neurological process that links the sensation with 
the response; it involves empirical probabilities, and is of course confined 
to marginal differences. In the latter, which is the domain of consumer- 
preference surveyors, there is a real population and hence a frequentially 
definable probability. 


¹Contribution from the Division of Applied Biology, National Research Laboratories, Ottawa, 
Canada. Issued as N.R.C. Report No. 5297. 




THE MODEL 


The symbols A and B will be used for the compared items themselves 
and also for the corresponding choices. A tie will be symbolized by 0. 
A full model for the act of comparison needs five parameters, although 
in most experiments some will be known a priori. They are defined 
below. The sampled population may be either an infinite number of 
trials on one subject, or a specified universe of subjects. We have the 
probabilities: 


$p_s$ = P (discrimination of the difference), 

$p_a$ = P (deliberate choice of A, after discrimination), 

$p_b$ = P (deliberate choice of B, after discrimination), 

$p_c$ = P (arbitrary choice of A or B, after discrimination), and 

$p_g$ = P (indiscriminate guessing), 


and the setting of these parameters is shown in Table 1, which is in 
fact the model. 

Some special cases of practical interest may be noted. Consider, 
first of all, discrimination tests on two similar stimuli of different intensities 
(the typical question would be, “Which is stronger?”). We 
now assume that if discrimination is achieved, the recognition of A and 
B is certain. Letting A be the stronger, we put $p_a = 1$ and $p_b = 0$, 
and obtain the 2-parameter trinomial system, 


$$\begin{aligned}
\Pr\{A\} &= p_s + p_g(1 - p_s)/2 \\
\Pr\{0\} &= (1 - p_g)(1 - p_s) \\
\Pr\{B\} &= p_g(1 - p_s)/2.
\end{aligned} \tag{1}$$

But if ties are not allowed (which implies $p_g = 1$ because all non-discriminated 
trials must be guessed at), we have the 1-parameter 
binomial system, 


$$\begin{aligned}
\Pr\{A\} &= (1 + p_s)/2 \\
\Pr\{B\} &= (1 - p_s)/2.
\end{aligned} \tag{2}$$


Secondly, consider preference trials in which the items are obviously 
different, so that $p_s = 1$. This yields the 3-parameter trinomial system, 


$$\begin{aligned}
\Pr\{A\} &= p_a + (1 - p_a - p_b)\,p_c/2 \\
\Pr\{0\} &= (1 - p_a - p_b)(1 - p_c) \\
\Pr\{B\} &= p_b + (1 - p_a - p_b)\,p_c/2.
\end{aligned} \tag{3}$$


And if ties are prohibited (which means $p_c = 1$), we have 


$$\begin{aligned}
\Pr\{A\} &= (1 + p_a - p_b)/2 \\
\Pr\{B\} &= (1 - p_a + p_b)/2.
\end{aligned} \tag{4}$$


In consumer testing the estimation of $p_a$ and $p_b$ may be important. 
This is clearly impossible with (4), but the trinomial situation represented 
by (3) does so lend itself, if replications are made. With k 
trials per subject each result must fall into one of $3^k$ classes. The 
statistics of experiments of this kind have been worked out by Ferris 
[3] with particular reference to k = 2, which yields reasonably precise 
estimates of the two parameters. 


TABLE 1 
COMPOSITION OF PROBABILITIES OF A, 0, AND B IN PAIR COMPARISON 
(The body of this table is not legible in the present copy.) 
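
A composition of the five probabilities that reduces to systems (1) to (4) can be sketched as follows. The general expressions below are an assumption inferred from those special cases and are not a transcription of Table 1; the name `p_c` for the probability of an arbitrary choice after discrimination is likewise chosen here purely for illustration.

```python
from math import isclose


def pair_comparison_probs(p_s, p_a, p_b, p_c, p_g):
    """Return (Pr{A}, Pr{0}, Pr{B}) under an assumed five-parameter composition.

    p_s: discrimination; p_a, p_b: deliberate choice of A or B after
    discrimination; p_c: arbitrary choice after discrimination (name assumed);
    p_g: guessing when the difference is not discriminated.
    """
    undecided = 1.0 - p_a - p_b                    # discriminated, no deliberate choice
    pr_a = p_s * (p_a + undecided * p_c / 2) + (1 - p_s) * p_g / 2
    pr_b = p_s * (p_b + undecided * p_c / 2) + (1 - p_s) * p_g / 2
    pr_0 = p_s * undecided * (1 - p_c) + (1 - p_s) * (1 - p_g)
    return pr_a, pr_0, pr_b


# Discrimination test with ties, system (1): p_a = 1, p_b = 0.
system_1 = pair_comparison_probs(p_s=0.5, p_a=1.0, p_b=0.0, p_c=0.0, p_g=0.8)

# Preference test without ties, system (4): p_s = 1, p_c = 1.
system_4 = pair_comparison_probs(p_s=1.0, p_a=0.5, p_b=0.3, p_c=1.0, p_g=0.0)
```

Whatever values are supplied, the three probabilities sum to unity, which is a convenient check on any alternative composition.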


HYPOTHESIS TESTING AND POWER 


A test of $H_0\{p_s = 0\}$ or of $H_0\{p_a = p_b\}$ is often required. We shall 
discuss the former only, but the argument is readily extensible to the 
latter. Now from both sets of equations (1) and (2) we have that 


$$p_s = \Pr\{A\} - \Pr\{B\}. \tag{5}$$

Furthermore, it is easily shown that 

$$\operatorname{var}(\hat{p}_s) = (p_s + p_g)(1 - p_s)/N \tag{6}$$


where N is the number of trials. As the variance of $\hat{p}_s$ monotonically 
decreases with $p_g$, it is in general to be expected that admission of ties 
will enhance power, unless offset by a diminution of $p_s$. In other 
words it is probably best to say to the subject “If you can’t decide, say 
so; don’t guess,” and then to leave all the ties (say T in number) out 
of consideration and to treat the remaining N − T results binomially. 
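
The binomial treatment of the N − T untied results can be sketched as below. The one-sided form of the test (probability of at least the observed number of choices of A under equal chances) is an assumption; it is adopted here because it agrees with the P values printed in Table 3 for the trinomial trials. The function name is illustrative.

```python
from math import comb


def no_tie_p_value(r_a, r_b):
    """One-sided P(X >= r_a) for X ~ Binomial(r_a + r_b, 1/2).

    Ties are assumed to have been left out already, so r_a + r_b = N - T.
    """
    n = r_a + r_b
    return sum(comb(n, k) for k in range(r_a, n + 1)) / 2 ** n


# Counts of A and B taken from three trinomial rows of Table 3.
p_values = [no_tie_p_value(9, 2), no_tie_p_value(8, 11), no_tie_p_value(11, 5)]
```

The three results correspond to the tabled 0.033, 0.820, and 0.105.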

For the analogous sign test, which is a non-subjective pair com- 
parison, Hemelrijk [5] has proved that the leaving of ties out of con- 
sideration makes for a more powerful test of the null hypothesis than if 
the ties are equally distributed between A and B. Here ties are given; 
there is no question of their prohibition. In ordinary, subjective pair 
comparison, prohibition of ties means in effect that they are randomly 
distributed between A and B. 


SOME PRACTICALITIES 


Although admission of ties should in general increase test power, 
the characteristics of the power function are such that, when N is not 
large, exceptions will occur. To exemplify this the powers for $p_s = 0.5$ 
and N = 6, 10, and 16 have been computed for $p_g = 1$ (no ties allowed) 
and $p_g$ = 0.8 and 0.6. This was done by computation of the trinomial 
probabilities of all sample points in the binomially specified critical 
regions. The results are assembled in Table 2. (It is assumed that the 
dichotomy $R_a : R_b$, where R denotes the number of successes in N trials, 
is the basis of the test of significance.) 
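
The computation described can be sketched as follows, under two stated assumptions: the critical region for each possible number of untied results is the one-sided binomial region of size not exceeding the chosen level, and the trinomial probabilities are those of system (1). Function names are illustrative. On these assumptions the sketch returns values close to the N = 6 entries of Table 2 (0.178, 0.219, 0.215).

```python
from math import comb


def critical_value(n, alpha):
    """Smallest c with P(X >= c | n, 1/2) <= alpha, or None if no region exists."""
    tail, c = 0.0, None
    for k in range(n, -1, -1):
        tail += comb(n, k) / 2 ** n
        if tail > alpha:
            break
        c = k
    return c


def power(N, p_s, p_g, alpha):
    """Sum the trinomial probabilities of all sample points in the critical regions."""
    pr_a = p_s + p_g * (1 - p_s) / 2
    pr_0 = (1 - p_g) * (1 - p_s)
    pr_b = p_g * (1 - p_s) / 2
    total = 0.0
    for t in range(N + 1):                 # t ties, left out of consideration
        n = N - t
        c = critical_value(n, alpha)
        if c is None:
            continue                       # no significant result possible
        for a in range(c, n + 1):          # a choices of A among the n untied
            total += comb(N, t) * comb(n, a) * pr_0 ** t * pr_a ** a * pr_b ** (n - a)
    return total


powers_n6 = [power(6, 0.5, p_g, 0.05) for p_g in (1.0, 0.8, 0.6)]
```

With N = 6 no critical region exists at the 0.01 level, so `power` returns zero there.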

Beyond purely statistical considerations we have to reckon with the 
possibility that a change of test design will affect a subject’s sensitivity, 
i.e., the size of $p_s$. It is therefore wise to assess the relative merits of 
designs empirically. To this end the following experiment was con- 
ducted: 


From 3 aqueous solutions of sucrose, X (1%), Y (1.02%), and 
Z (1.04%), two pair comparisons were drawn, viz., X versus Y, 
and X versus Z. Each was offered in replicate to 2 subjects, 
once with the question, “Which is sweeter?”, and again with, 
“Which, if either, is sweeter?” Tastings were done in blocks 
of 20 pairs, under code and randomized with the restriction 
that left- and right-hand orderings of the presentation were 
balanced. 


The results are shown in Table 3. The null hypothesis is of course 
that sensory discrimination of the sweetness contrast is zero, so that 
in comparing the two test designs we must equate the smaller probability 
of acceptance of $H_0$ with the more successful design. In the present 
instance the differences between the contrasted probabilities are too 
small and too variable to allow for a clear-cut choice, although there is 
a slight tendency towards a smaller P in the no-tie trials. If the results 
are pooled over subjects and contrasts, we find, 


$\hat{p}_s$ (binomial) = 0.425 ± 0.072 
$\hat{p}_s$ (trinomial) = 0.294 ± 0.067 


which suggests that the trinomial test (ties allowed) depresses $p_s$. All 
in all, the simpler binomial design seems here to be superior. 
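
The pooled estimates can be recovered from the column sums of Table 3 by means of (5) and (6). The sketch below assumes that the unknown probability of guessing in (6) is replaced by the value implied by the observed proportion of ties, through Pr{0} = (1 − p_g)(1 − p_s); this substitution is an assumption, not a statement of how the quoted standard errors were obtained. The function name is illustrative.

```python
from math import sqrt


def estimate_ps(r_a, r_b, r_0=0):
    """Return (p_s estimate, implied p_g, standard error) from pooled counts.

    Assumes r_a > r_b is not required, but the estimate of p_s must be below 1
    so that the implied p_g is defined.
    """
    n = r_a + r_b + r_0
    p_s = (r_a - r_b) / n                     # equation (5)
    p_g = 1.0 - (r_0 / n) / (1.0 - p_s)       # from Pr{0} = (1 - p_g)(1 - p_s)
    se = sqrt((p_s + p_g) * (1.0 - p_s) / n)  # square root of equation (6)
    return p_s, p_g, se


binomial = estimate_ps(114, 46)         # no ties allowed, so p_g = 1
trinomial = estimate_ps(88, 41, 31)     # sums of correct, incorrect, tied
```

The implied probability of guessing in the trinomial trials is about 0.73 on this reckoning.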


PREFERENCE TESTING 


In preference work the admission of ties is usually valuable; it adds 
to the information content of the experiment. Ties form a neutral 
class that has some of the properties of a zero interval on a hedonic 
continuum, and the quantities Pr{A}, Pr{0}, and Pr{B} can be regarded 
as incomplete integrals of a near-normal p.d.f. whose abscissa will 
serve as a preference scale [9]. It is, however, to be noted that the 
binomial no-tie set-up is mandatory in the kind of multi-pair-comparison 
preference ordering first described by Thurstone [8], (see also Mosteller 
[6], Bradley and Terry [2], and Gridgeman [4]), because the significance 
and goodness-of-fit tests are based on the binomial distribution. 




TABLE 2 

POWERS OF SOME N-PLICATED PAIR COMPARISONS, WITH AND WITHOUT 
TIES, WHEN THE PROBABILITY OF SENSORY DISCRIMINATION IS 
CONSTANT (AT $p_s$ = 0.5) 

                    No Ties           Ties Allowed 
  N       α       $p_g$ = 1     $p_g$ = 0.8   $p_g$ = 0.6 

  6      0.05       0.178         0.219         0.215 
         0.01         -             -             - 
 10      0.05       0.244         0.351         0.417 
         0.01       0.056         0.104         0.159 
 16      0.05       0.630         0.621         0.687 
         0.01       0.197         0.298         0.349 

α = confidence coefficient. $p_g$ = probability of guessing the identity of the items when they are 
not discriminated; it is constrained to unity when ties are prohibited. 


TABLE 3

FREQUENCY OF CORRECT (R_a), INCORRECT (R_b), AND TIED (R_0) DECISIONS IN (N = 20)
PAIR COMPARISONS OF TASTE INTENSITIES, AND PROBABILITY (P) OF
ACCEPTABILITY OF THE NULL HYPOTHESIS THAT THE ITEMS ARE
SENSORILY INDISTINGUISHABLE*


                                      Binomial                    Trinomial
Subject   Comparison   Block    R_a   R_b       P          R_a   R_0   R_b      P

P.C.         X:Y         1       13     7   [illegible]     11     4     5    0.105
                         2        8    12     0.868          8     1    11    0.820
J.G.         X:Y         1       16     4     0.006          8     4     8    0.598
                         2       11     9     0.412          8     5     7    0.500
P.C.         X:Z         3       16     4     0.006         11     4     5    0.105
                         4       15     5     0.021          9     9     2    0.033
J.G.         X:Z         3       17     3     0.001         15     3     2    0.001
                         4       18     2     0.000         18     1     1    0.000

Sums                            114    46                   88    31    41


*The comparisons were of sweetness intensities, with X:Y a 2% difference and X:Z a 4%
difference.
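
The P values in Table 3 appear to be reproducible as one-sided binomial tail probabilities with null probability 1/2, the ties being set aside in the trinomial case. This is an inference from the tabulated numbers rather than a statement made in the text, and the function name below is hypothetical.

```python
from math import comb


def sign_test_p(correct, incorrect):
    """One-sided P(X >= correct) for X ~ Binomial(correct + incorrect, 1/2)."""
    n = correct + incorrect
    return sum(comb(n, i) for i in range(correct, n + 1)) / 2**n


# Binomial blocks: (correct, incorrect).
for counts in [(8, 12), (11, 9), (16, 4), (15, 5), (17, 3), (18, 2)]:
    print(counts, round(sign_test_p(*counts), 3))

# Trinomial blocks: (correct, incorrect), with the tied decisions dropped.
for counts in [(11, 5), (8, 11), (8, 8), (8, 7), (9, 2), (15, 2), (18, 1)]:
    print(counts, round(sign_test_p(*counts), 3))
```

Dropping the ties before testing is the design choice assumed here; it makes the trinomial test a conditional sign test on the decided comparisons only.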




A detail in the literature on multinomial pair comparison for prefer- 
ences may be mentioned here. Scheffé [7], illustrating his analysis-of- 
variance technique for graded pair comparison, uses a 7-point preference 
scale with ties allowed, and he comments on a “scarcity of zeros.” A 
similar observation is made by Bliss [1] on some data of his own taken 
on a 7-point Scheffé scale. The meaning is that the subjects, in the 
course of the n(n — 1) pair comparisons (half A:B and half B:A) on n 
test items, built up frequency histograms whose mid-classes were lower 
than the adjacent classes. But a consideration of the theoretical 
composition of the histograms will show that a dip at the zero point is 
normally to be expected. A peak, on the other hand, would imply a 
glut of ties, i.e. a tendency to shirk decision. In a comparatively long 
scale there is something to be said for the exclusion of ties; the scale 
points can be adjusted accordingly, and decision cannot be shirked. 


REFERENCES 


[1] Bliss, C. I., Greenwood, M. L., and White, E. S. [1956]. A rankit method of
paired comparisons for measuring the effect of sprays on flavor. Biometrics 12,
381-403.

[2] Bradley, R. A., and Terry, M. E. [1952]. The rank analysis of incomplete block
designs. I. The method of paired comparison. Biometrika 39, 324-45.

[3] Ferris, G. E. [1958]. The k-visit method of consumer testing. Biometrics 14,
39-49.

[4] Gridgeman, N. T. [1955]. The Bradley-Terry probability model and preference
tasting. Biometrics 11, 335-43.

[5] Hemelrijk, J. [1952]. A theorem on the sign test when ties are present. Proc.
Royal Acad. Amsterdam (Series A) 55, 322-26.

[6] Mosteller, F. [1951]. Remarks on the method of paired comparisons.
Psychometrika 16, 3-9; 203-6; 207-18.

[7] Scheffé, H. [1952]. An analysis of variance for paired comparisons. J. Amer.
Stat. Assoc. 47, 381-400.

[8] Thurstone, L. L. [1927]. Psychophysical analysis. Amer. J. Psychol. 38, 368-89.

[9] Thurstone, L. L. [1927]. A law of comparative judgment. Psychol. Rev. 34,
273-86.




A METHOD FOR TESTING TREATMENT EFFECTS IN THE 
PRESENCE OF LEARNING 


SEYMOUR GEISSER


National Institute of Mental Health 
Bethesda, Maryland, U.S.A. 


1. INTRODUCTION 


In order to test whether treatments differ, an experimenter often
will assign experimental units at random to the various treatments. He
will then measure the responses and analyze his data by the one-way
analysis of variance.
In some studies, experimental units are difficult to obtain for one reason 
or another and the experimenter gives all the treatments to each unit 
one at a time. If the response of every treatment given to each unit is 
independent of the response of every other treatment, and the variation 
of the response is the same for all the treatments, the analysis for treat- 
ment differences consists of the mixed model analysis. If, on the other 
hand, the assumptions of independence and homoscedasticity are 
violated, the statistician has recourse to an analysis which depends on 
multivariate techniques (see Scheffé [5] or Graybill [3]). However, 
quite often the response is measured in terms of performance to a given 
task and when the experimental units are humans, not infrequently 
the individuals tend to improve with repetition of the task so that 
estimates of the treatment effects may become confounded with the 
factor of learning unless the experiment has been properly designed. 
An experimental design will be presented for this case which is essentially 
a latin square arrangement and which can be analyzed by multi- 
variate methods. 

Consider the following experimental setup: np individuals from the 
population under investigation are assigned at random to p groups, 
n to a group. The p groups are given p treatments on p different days,
in a latin square arrangement. Hence, every individual in a particular 
group is given the same treatment on any given day, no two groups 
receive the same treatment on any given day and every group receives 
each treatment. 




To test the hypothesis that the treatment responses are different, 
it is inappropriate to use the usual analysis of variance procedures 
since there is certainly dependence among the treatment responses and 
perhaps differences in variation among the different treatment responses. 
A method of analysis will be presented which provides an exact test for
the hypothesis of differences among the treatment responses. It is based
on techniques discussed by Hsu [4], Scheffé [5], and T. W. Anderson [1],
which adapted Hotelling's $T^2$ statistic to similar problems.


2. THE METHOD 
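Since we have given every treatment to each unit or individual, we
can consider an individual as a multivariate observation where

$$x_{j\alpha}' = (x_{j\alpha 1}, \ldots, x_{j\alpha p}),$$

$\alpha = 1, \ldots, n$, $j = 1, \ldots, p$, is the vector representation of the $\alpha$th
individual in the $j$th group with the indices $1, \ldots, p$ representing the
$p$ treatments (variables). Let $x_{j\alpha}$ have a multivariate normal
distribution with mean $\mu_j' = (\mu_{j1}, \ldots, \mu_{jp})$ and variance-covariance
matrix $V_j$. Now $V_j = V$ for all $j$ if the underlying model is additive.
If the latin square arrangement were

$$\begin{array}{c|cccc}
       & T_1 & T_2 & \cdots & T_p \\ \hline
G_1    & D_1 & D_2 & \cdots & D_p \\
G_2    & D_2 & D_3 & \cdots & D_1 \\
\vdots & \vdots & \vdots &  & \vdots \\
G_p    & D_p & D_1 & \cdots & D_{p-1}
\end{array}$$

where $T_i$ is the $i$th treatment, $G_j$ is the $j$th group, and $D_k$ is the $k$th day,
then for the additive model

$$\begin{aligned}
\mu_1' &= (t_1 + d_1,\; t_2 + d_2,\; \ldots,\; t_p + d_p) \\
&\;\;\vdots \\
\mu_{p-1}' &= (t_1 + d_{p-1},\; t_2 + d_p,\; \ldots,\; t_{p-1} + d_{p-3},\; t_p + d_{p-2}) \\
\mu_p' &= (t_1 + d_p,\; t_2 + d_1,\; \ldots,\; t_{p-1} + d_{p-2},\; t_p + d_{p-1})
\end{aligned}$$

where $t_i$ is the effect of the $i$th treatment, $d_k$ is the effect of the $k$th day.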




Although we are treating here a particular latin square, the results that 
follow are true for any latin square arrangement. 
Apply a transformation to $x_{j\alpha}$ so that

$$y_{j\alpha} = C x_{j\alpha},$$

where $C$ is any $p - 1$ by $p$ matrix such that every row sums to zero
(see discussion of this transformation in Anderson [1] pp. 111-12).
One such matrix is

$$C = p^{-1}\begin{pmatrix}
p-1 & -1 & \cdots & -1 & -1 \\
-1 & p-1 & \cdots & -1 & -1 \\
\vdots & & \ddots & & \vdots \\
-1 & -1 & \cdots & p-1 & -1
\end{pmatrix}.$$

This gives us a new vector for each individual which has $p - 1$ elements,
$y_{j\alpha}' = (y_{j\alpha 1}, \ldots, y_{j\alpha, p-1})$. Hence $y_{j\alpha}$ has variance-covariance
matrix $CVC'$ and expectation $C\mu_j$. Let

$$\bar{y}_{j\cdot} = n^{-1} \sum_{\alpha=1}^{n} y_{j\alpha}$$

be the mean vector of the $j$th group. Then $\bar{y}_{j\cdot}$ has variance-covariance
matrix $n^{-1}CVC'$ and expectation $C\mu_j$. Let

$$\bar{y}_{\cdot\cdot} = p^{-1} \sum_{j=1}^{p} \bar{y}_{j\cdot}$$

be the vector of the means of all the adjusted treatments for all $pn$
individuals. Hence $\bar{y}_{\cdot\cdot}$ has variance-covariance matrix $(pn)^{-1}CVC'$
and expectation

$$p^{-1} C \sum_{j=1}^{p} \mu_j = C\mu$$

where $\mu' = (t_1 + \bar{d}, t_2 + \bar{d}, \ldots, t_p + \bar{d})$ and $\bar{d} = p^{-1}\sum_{k=1}^{p} d_k$. Further
since every row of $C$ sums to zero, $C\mu$ is independent of $\bar{d}$, and for the
particular $C$ previously presented $(C\mu)' = (t_1 - \bar{t}, t_2 - \bar{t}, \ldots, t_{p-1} - \bar{t})$
where $\bar{t}$ is the average of the treatment effects. Now to test the hypothesis
that $t_1 = t_2 = \cdots = t_p$ is equivalent to testing the hypothesis
that $t_i - \bar{t} = 0$, $i = 1, \ldots, p - 1$. An estimate of $CVC'$ then is
got by computing the sample variance-covariance matrix for each of
the $p$ groups on the transformed variables and then pooling the $p$
matrices. Now the statistic

$$T^2 = pn\, \bar{y}_{\cdot\cdot}'\, S^{-1}\, \bar{y}_{\cdot\cdot},$$

where $S$ is the pooled variance-covariance matrix, has the $T^2$ distribution
with $p(n - 1)$ degrees of freedom and

$$\frac{T^2\,[(n - 2)p + 2]}{p(p - 1)(n - 1)} = F[p - 1,\ (n - 2)p + 2],$$

where $F$ has the usual $F$ distribution with $p - 1$ and $(n - 2)p + 2$
degrees of freedom. This statistic provides a test of the hypothesis
that the treatment effects are different.
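
The steps of this section can be sketched in code. What follows is a minimal pure-Python illustration, not the author's program: the function names `solve` and `latin_square_t2` and the nested-list data layout are assumptions made for the example. It uses the particular C described above (each of the first p - 1 responses minus the individual's own mean), pools the within-group covariance matrices with equal group sizes, and returns T² together with its F transform.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    m = len(b)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for c in range(m):
        # Partial pivoting: bring the largest remaining entry of column c up.
        piv = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(m):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [aug[i][m] / aug[i][i] for i in range(m)]


def latin_square_t2(groups):
    """groups[j][a][i] = response of individual a in group j to treatment i.

    Returns (T2, F, df1, df2) for p groups of n individuals each.
    """
    p, n = len(groups), len(groups[0])
    q = p - 1
    # y = Cx: first p - 1 responses, each minus the individual's own mean.
    ys = [[[x[i] - sum(x) / p for i in range(q)] for x in g] for g in groups]
    group_means = [[sum(y[i] for y in g) / n for i in range(q)] for g in ys]
    ybar = [sum(m[i] for m in group_means) / p for i in range(q)]
    # Pooled within-group covariance matrix on p(n - 1) degrees of freedom.
    s = [[0.0] * q for _ in range(q)]
    for g, m in zip(ys, group_means):
        for y in g:
            for r in range(q):
                for c in range(q):
                    s[r][c] += (y[r] - m[r]) * (y[c] - m[c]) / (p * (n - 1))
    t2 = p * n * sum(u * v for u, v in zip(ybar, solve(s, ybar)))
    df1, df2 = q, (n - 2) * p + 2
    f = t2 * df2 / (p * (p - 1) * (n - 1))
    return t2, f, df1, df2
```

Because every row of C sums to zero and each day occurs once with each treatment, adding arbitrary day effects to the data according to the latin square leaves the returned T² unchanged, which is the property the design relies on.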


3. EXAMPLE 


An experimenter wished to test for differences among three drugs 
with regard to a particular performance test about which he suspected 
that there was a practice or learning effect. He obtained a random 
sample of twelve individuals from the population under investigation 
and assigned at random four individuals to each of three groups. He 
chose the following latin square arrangement. 




             Drug x    Drug y    Drug z

Group 1      Day 1     Day 2     Day 3
Group 2      Day 2     Day 3     Day 1
Group 3      Day 3     Day 1     Day 2


The drugs are given on three days far enough apart so that there is 
no carry-over drug effect. The task is performed shortly after the 
administration of the drug. The data that the investigator collected 
in accordance with the previous design are as follows: 


             D_x    D_y    D_z      D̄     D_x - D̄   D_y - D̄

Group 1      109    118     88    105.0      4.0       13.0
             114    111    109    111.3      2.7       -0.3
             163    146    139    149.3     13.7       -3.3
             129    132    121    127.3      1.7        4.7

Group 2       92     93     79     88.0      4.0        5.0
             108    125    106    113.0     -5.0       12.0
             150    163    128    147.0      3.0       16.0
             142    141    125    136.0      6.0        5.0

Group 3      117    133    108    119.3     -2.3       13.7
             151    152    134    145.7      5.3        6.3
             151    154    137    147.3      3.7        6.7
             122    121    111    118.0      4.0        3.0




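We now compute the sample variance-covariance matrix for each
of the three groups:

$$S_1 = \begin{pmatrix} 30.59 & -31.31 \\ -31.31 & 50.79 \end{pmatrix}, \qquad
S_2 = \begin{pmatrix} 23.33 & -12.67 \\ -12.67 & 29.67 \end{pmatrix},$$

and

$$S_3 = \begin{pmatrix} 11.48 & -13.59 \\ -13.59 & 20.25 \end{pmatrix}.$$

The pooled variance-covariance matrix is $S = \tfrac{1}{3}(S_1 + S_2 + S_3)$
where

$$S = \begin{pmatrix} 21.80 & -19.19 \\ -19.19 & 33.57 \end{pmatrix}$$

and

$$S^{-1} = \begin{pmatrix} .0923 & .0528 \\ .0528 & .0600 \end{pmatrix}.$$

The vector mean is $\bar{y}_{\cdot\cdot}' = (3.40, 6.82)$ and

$$T^2 = 12\,(3.40,\ 6.82)
\begin{pmatrix} .0923 & .0528 \\ .0528 & .0600 \end{pmatrix}
\begin{pmatrix} 3.40 \\ 6.82 \end{pmatrix} = 75.7.$$

Transforming to $F(2, 8)$, we get $F(2, 8) = \tfrac{8}{18}\, T^2 = 33.6$ which is
highly significant.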
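
The final arithmetic of the example can be retraced from the printed pooled matrix and mean vector. The sketch below is illustrative only; the variable and function names are assumptions. It inverts the printed 2 by 2 matrix S, forms T² and the F ratio, and also recomputes the group 1 matrix from the tabulated differences as a check. One caveat, offered as an observation rather than a correction: that recomputation reproduces the printed variances 30.59 and 50.79 but gives a covariance of about -21.31 rather than the printed -31.31, so a full recomputation from the raw data yields a smaller T² than 75.7.

```python
def sample_cov(u, v):
    """Sample covariance of two equal-length lists (divisor n - 1)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


# Quantities as printed in the example.
S = [[21.80, -19.19], [-19.19, 33.57]]
ybar = [3.40, 6.82]
p, n = 3, 4

# Inverse of a 2 x 2 matrix via its determinant.
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
S_inv = [[S[1][1] / det, -S[0][1] / det],
         [-S[1][0] / det, S[0][0] / det]]

quad = sum(ybar[r] * S_inv[r][c] * ybar[c] for r in range(2) for c in range(2))
T2 = p * n * quad
F = T2 * ((n - 2) * p + 2) / (p * (p - 1) * (n - 1))
print(S_inv, T2, F)

# Group 1 differences as tabulated: D_x minus the mean, D_y minus the mean.
g1_x = [4.0, 2.7, 13.7, 1.7]
g1_y = [13.0, -0.3, -3.3, 4.7]
s1 = [[sample_cov(g1_x, g1_x), sample_cov(g1_x, g1_y)],
      [sample_cov(g1_y, g1_x), sample_cov(g1_y, g1_y)]]
print(s1)
```

Run as written, the first part agrees with the printed inverse, T² and F to rounding.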

It is interesting to note here that if the experiment had been stopped
after the first day the drugs could have been compared by a one-way
analysis of variance on the assumption of equal variances. This would
have led to an F(2, 9) ratio of 2.11, which is far from being
significant at the .05 level. The increased sensitivity of the
method stems from the fact that the correlations between the
untransformed treatments were positive. If one or more of the correlations
had been large and negative it might have been possible for the one-way
analysis of variance to have been more sensitive. However, in problems
of the type discussed here the correlations are usually positive.
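
The first-day comparison can be checked with a short one-way analysis of variance. This is a sketch under two assumptions: that the three data columns are in the order drug x, drug y, drug z, and that on day 1 group 1 received x, group 2 received z, and group 3 received y, as in the latin square of this section. The function name is hypothetical.

```python
def one_way_f(samples):
    """F ratio and degrees of freedom for a one-way analysis of variance."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand = sum(sum(s) for s in samples) / n_total
    ss_between = sum(len(s) * (sum(s) / len(s) - grand) ** 2 for s in samples)
    ss_within = sum((v - sum(s) / len(s)) ** 2 for s in samples for v in s)
    df1, df2 = k - 1, n_total - k
    return (ss_between / df1) / (ss_within / df2), df1, df2


# Day 1 responses only, keyed by the drug given that day.
day1 = {
    "x": [109, 114, 163, 129],  # group 1
    "z": [79, 106, 128, 125],   # group 2
    "y": [133, 152, 154, 121],  # group 3
}
f_ratio, df1, df2 = one_way_f(list(day1.values()))
print(round(f_ratio, 2), df1, df2)
```

With these assignments the ratio agrees with the value quoted in the text.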


4. YOUDEN TYPE SQUARES 


In the example given in the previous section, suppose that the data 
for the third day were lost or that for some reason or another the ex- 
periment had to be terminated at the end of the second day. Then the 
experimental setup might be as follows: 




Drug x Drug y Drug z 
Group 1 Day 1 Day 2 No Data 
Group 2 Day 2 No Data Day 1 
Group 3 No Data Day 1 Day 2 


This then is a Youden square arrangement and under the additive
model of Section 2, the transformed mean vector should still have
expectation zero for each of its elements. Furthermore, one may still
get estimates of the variance-covariance matrix of $D_x$, $D_y$, $D_z$.
However the variances are estimated with $2n - 2$ degrees of freedom while
the covariances are estimated only with $n - 1$ degrees of freedom. These
estimates can be further combined to give estimates of the variance-covariance
matrix of $D_x - \bar{D}$ and $D_y - \bar{D}$. Hence we may still form a
pseudo $T^2$ statistic, whose distribution unfortunately is not known.
We may transform this pseudo $T^2$ into a pseudo $F$ as before and for
large $n$ the pseudo $F$ will be approximately distributed like $\tfrac{1}{2}\chi^2(2)$
where $\chi^2(2)$ is a chi-square variate with 2 degrees of freedom.

In general a Youden square arrangement for $p$ treatments will lead
to a pseudo $F$ statistic such that the pseudo $F$ will be approximately
distributed like $(p - 1)^{-1}\chi^2(p - 1)$ for large $n$.


5. ALTERNATIVE MODELS 


We assumed earlier that we were dealing with an additive model.
Under this assumption the variance-covariance matrix $V_j$ of each of
the $p$ groups was equal to $V$. If the true model is either partially
multiplicative and partially additive (i.e. the treatment effects are
additive and the day effects are multiplicative or vice versa) or wholly
multiplicative (i.e. both the treatment effects and the day effects are
multiplicative) then $V_j \neq V$. Hence testing for additivity
against either alternative is equivalent to testing the homogeneity of
the $p$ sample variance-covariance matrices. A test for the homogeneity
of variance-covariance matrices can be found in Box [2]. If we reject
the assumption of additivity in favor of one of the others, the $T^2$ statistic
no longer is distributed like Hotelling's $T^2$. However it still appears to
be a reasonable statistic to use in that the pooled sample variance-covariance
matrix of the transformed variables is still an estimate of the
variance-covariance matrix of the sample mean vector of the transformed
variables. The expected value of every element of this vector
is still zero even under the alternative models. Although the sampling
distribution of $T^2$ in the alternative cases is probably impossible to
find in explicit terms, $T^2[(n - 1)p + 2]/[p(n - 1)]$ is approximately
distributed like $\chi^2(p - 1)$ for large $n$. The small sample testing situation
is equivalent to the multivariate Fisher-Behrens problem of which
one possible solution is put forth in Anderson ([1] pp. 118-22).


6. FURTHER REMARKS 


It is of interest to point out that Sweeny, in an unpublished report [6],
also applied a $T^2$ test to a latin square design, though for a somewhat
different type of problem and using another model. I am indebted to R. A.
Bradley, Editor of Biometrics, who was familiar with this report, for
having brought it to my attention.


REFERENCES 


[1] Anderson, T. W. [1958]. Introduction to Multivariate Statistical Analysis. New
York: Wiley.

[2] Box, G. E. P. [1949]. A general distribution theory for a class of likelihood
criteria. Biometrika 36, 317-46.

[3] Graybill, F. [1954]. Variance heterogeneity in a randomized block design.
Biometrics 10, 516-20.

[4] Hsu, P. L. [1938]. Notes on Hotelling's generalized $T^2$. Ann. Math. Stat. 9,
231-43.

[5] Scheffé, H. [1956]. A "mixed model" for the analysis of variance. Ann. Math.
Stat. 27, 23-36.

[6] Sweeny, H. [1955]. Tests of homogeneity for experimental designs where errors are
correlated and have heterogeneous variances. Progress Report No. 8 to the
Quartermaster Research and Development Command by the Virginia Polytechnic
Institute, Blacksburg, Virginia.




ON THE DEVELOPMENT OF CLINICAL STATISTICAL 
SYSTEMS FOR PSYCHIATRY 


J. B. CHASSAN


St. Elizabeths Hospital 
Washington, D. C., U.S.A. 


The essential purposes of this presentation are to state briefly some 
of the basic properties of the data of clinical psychiatry and the impli- 
cations that these properties have for the application of statistical 
methodology and inference to such data. 

One of the most important properties of data relevant to the study 
of psychopathological processes is that of a general underlying vari- 
ability within the individual patient-state from one week to the next, 
or from one day to the next, or even from moment to moment, with 
respect to a large number of variables which are considered to be im- 
portant in the description and characterization of mental illness. The 
realization of such variability as a fundamental property of psychi- 
atric data leads to the conclusion that an observation of a patient-state 
at a single point in time or with respect to a short time interval cannot 
in general be expected to provide anywhere near as precise or statis- 
tically definitive a description of the patient-state as can be obtained 
from repeated observations of the same patient taken at relatively 
close intervals and over a period of sufficient duration. Thus if X repre- 
sents a vector whose respective components are descriptive of various 
aspects of the patient-state as observed at a single point in time, then 
we are interested in estimating the distribution of X for each patient 
over an adequate period of time rather than in obtaining a single read- 
ing of X. This is with respect to obtaining an adequate description 
of the patient-state per se in statistical-clinical terms even before one 
uses such a description within a clinical experimental design such as 
might be used in the evaluation of two or more tranquilizers. 

The description of patient-states in terms of probabilities estimated 
from sequences of momentary or short-term states will in general in- 
clude transition probabilities, i.e., probabilities that are dependent on 
the previous state. This concept of a stochastic definition of the
patient-state is entirely consistent with Barankin’s [1956] view that
“personality processes and stochastic processes are one and the same kind
of thing.” The particular point that I am stressing in this connection 
is that experimental designs in clinical psychology and psychiatry must 
in general take into account the stochastic properties of personality 
and psychopathology if they are to be meaningful or at all precise. 
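
As an illustration of what estimating transition probabilities from a sequence of short-term states could look like, the following is a minimal sketch under simple assumptions: one patient, one categorical rating per observation, and first-order dependence on the previous state only. The state labels and the function name are hypothetical and are not taken from the text.

```python
from collections import Counter, defaultdict


def transition_probabilities(states):
    """Estimate P(next state | current state) from one observed sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        s: {t: c / sum(row.values()) for t, c in row.items()}
        for s, row in counts.items()
    }


# Hypothetical daily ratings for a single patient.
ratings = ["calm", "calm", "agitated", "calm",
           "agitated", "agitated", "calm", "calm"]
probs = transition_probabilities(ratings)
print(probs)
```

Each estimated probability is a relative frequency of observed transitions, so its precision depends directly on how many observations of the patient are available.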

This general approach to the application of statistics to clinical 
psychiatry and psychology is also in close agreement with the views 
expressed by Bush and Mosteller [1955] concerning the application of 
stochastic models to learning theory. The following comments of Bush 
and Mosteller, quoted from “Stochastic Models for Learning’’ are 
particularly relevant: 

“Data on animal and human learning present peculiar problems to 
the statistician; since irreversible changes take place while the data 
are being collected, repeated sampling is seldom possible. Organisms 
that can be considered ‘identical’ at the start of an experiment do not 
remain completely ‘identical’ because each has a different history dur- 
ing the course of the experiment. Observations such as these often 
throw doubts on the routine application of standard statistical pro- 
cedures. More important, they suggest that, if methods specifically 
designed for handling these data were available, considerable gains in 
efficiency and meaningfulness would obtain ... .” 

Further, Bush and Mosteller believe that behavior is intrinsically 
probabilistic, although such an assumption is not necessary for the 
application of stochastic models, for “whether behavior is statistical 
by its very nature or whether it appears to be so because of uncon- 
trolled or uncontrollable conditions does not really matter to us. In 
either case we would hold that a probability model is appropriate for 
describing a variety of experimental results presently available.” 

In addition to providing challenging theoretical problems in the 
construction of appropriate stochastic models and in the handling of 
associated problems of statistical inference, the frequency of observa- 
tion about each patient which is required in the use of stochastic models 
has the very important advantage of providing meaningful data to 
the clinician on an ongoing basis as opposed to providing him with 
results based on mere endpoint observations. The clinician knows he 
is dealing with process. He cannot help but remain unimpressed with 
statistical procedures and results which are applied to observations 
made at comparatively isolated points in time, and which do not tell 
him something of what has been happening along the way. This is
true not only of the application of statistics to the study of the process
of individual psychotherapy, but also of the area of clinical
psychopharmacology, in which one deals with such questions as those involved
in the determination of optimal dosage level, the risk of side reactions,
and other questions pertinent to decisions regarding the administration
of ataractic drug therapy.

Clinical statistical systems can provide bases for the organization 
and systematic recording of clinical experience itself or at least of var- 
ious of its aspects. Observations periodically noted in this connection 
could be quite similar to those such as might ordinarily and routinely 
be made with respect to patients at periodic intervals, but which in the 
absence of a systematic procedure for their notation often become 
quantities of lost data or heavily biased and vague recollections. The 
process of noting clinical observations systematically also allows for 
their use either subsequently or on an ongoing basis in estimating 
probabilities and risks, and testing hypotheses within a regular frame- 
work of patient observation as well as within specially designed studies. * 

When the quantity of data with respect to an aspect of clinical 
investigation is insufficient for the estimation of a particular probability 
with a high degree of precision, or for the application of a test of sig- 
nificance, or if the appropriate statistical tools have not yet been de- 
veloped for the particular context of statistical inference, the value of 
a clinical statistical system is that it provides a more complete descrip- 
tion of whatever data has been accumulated and thereby contributes 
to a sounder basis for clinical judgment than if the data had been left 
to fragmentary recollection or impression alone. Furthermore, such
systematic description can be of particular value in clinical evaluation
and decision in conjunction with pertinent laboratory findings, or other
relevant data, or in relation to subject-matter theory (see Chassan
[1959]).

The use of statistics for these purposes is certainly of greater value 
and justifiability than, say, is the publication of the results of tests of 
significance in instances where the obvious lack of statistical inde- 
pendence results in entirely fictitious levels of significance. The essen- 
tial point, however, is that one need not forego the goal of precise valid 
statistical inference for the sake of developing useful data systems on 
an ongoing basis in connection with the uses which have been mentioned 
above. There is no reason why both of these functions cannot be 
developed together. 

*Although the main emphasis here is on the application of stochastic models to the detailed study
of clinical phenomena, the possible value of the application of stochastic models to such broad data
as those of incidence and prevalence is also worth noting; in this connection see Marshall and Goldhamer
[1955].

An essential characteristic of many of the important outcome
variables in the field of application under discussion is that of their
statistical dependence with respect to the conditions under which many
studies are performed. What I have reference to at this point is not 
so much dependence on previous states (which can be handled at least 
in theory by the use of appropriate Markov models) but rather to 
dependence resulting from personal interaction between patients in a 
given study. This is quite a serious problem from the point of view 
of statistical inference in clinical trials in which all patients in a study 
out of medical or administrative necessity, or both, have to be on the 
same ward. The problem of inference in the face of this kind of statis- 
tical dependence appears to be hopelessly insoluble, if in addition, 
only endpoint observations are made with respect to each patient. If 
a study has to be confined to a single ward, the use of frequent ongoing 
clinical observations about each patient, (which, incidentally, is an 
essential property of what I am referring to as a clinical statistical 
system) apart from its basic function of providing meaningful clinical 
descriptions and stochastic definitions of the patient-state, may also 
be of value in measuring the degree of statistical dependence of obser- 
vations between patients insofar as day-to-day or week-to-week ob- 
servations of the patient-state are concerned. 

Although one justifiably thinks of statistical dependence as a sort 
of nuisance in studies such as e.g., those which seek to compare the 
relative effectiveness of two tranquilizers, the particular variables which 
are dependent as a consequence of interaction between patients ought 
also to be studied in relation to such dependence. Consequently,
statistical dependence in the data of clinical psychiatry requires careful
attention for two reasons. First, one must avoid what seems to be the
fairly common malpractice of treating the data as though the observations
were statistically independent with regard to significance testing,
and so forth. Second, statistical dependence of the observations
is in general an important phenomenon, and deserves to be studied as
part of the essential process of the psychopathology; that is, it is
important to know the conditions and degree to which changes in one
patient appear to extend to other patients in the same setting.

A particular difficulty which is encountered in the attempt to apply 
statistical methodology to clinical research is the apparent high order 
of dimensionality of the events which are considered to be of relevance 
in the study of psychopathology. There is, in fact, so ramified an 
extent of phenomena which are considered to have a possible bearing 
one way or another on the description of mental illness that with very
little effort one can wind up with far more variables than available
degrees of freedom. In addition to this technical statistical difficulty
in the usage of all possibly relevant data, there is also a difficulty on a
purely observational or data collecting level. That is, there are both
mental and physical limitations to the number of components of the 
hypothetical vector of all possibly relevant components which can be 
observed within the definition of an event. This, of course, is the well- 
known phenomenon or principle of complementarity—to borrow the 
usage of that term from quantum mechanics, and is of particular im- 
portance if one accepts the validity of the study of psychopathological 
processes within the framework of stochastic models and the consequent 
requirement of frequent observations of the patient-state. 

Thus the acceptance of a probabilistic approach to the definition 
of psychopathological processes requires the abandonment of over- 
elaborate readings of what might be called the momentary patient- 
state in favor of the selection of a smaller number of more carefully 
defined variables which can be reported on with a greater frequency, 
and which can be studied by the use of appropriate stochastic models. 
In practice this may not be a real loss even in the description of the 
momentary state, for, the larger the number of items that are reported 
on an examination form or set of forms, the less likely is it that the 
items in general will be defined with great clarity or with a high degree 
of inter-rater reliability. Moreover, various sets of the separate items 
are not actually separate dimensions in any fundamental sense, but 
are used as possibly different aspects or “readings” of the same under- 
lying dimension. Apart from questions which have been raised con- 
cerning the use of parametric techniques in the face of rather arbitrarily 
defined numerical variate-values, the use of factor analysis in handling 
this kind of data for the purpose of establishing dimensionality, in the 
light of the present approach, is also subject to the criticism of ignoring 
the probabilistic aspects of the individual patient-state. To incorporate
the latter into factor analytic techniques would require heroic
computational efforts which, in the light of other criticisms of the application
of factor analytic techniques to the study of clinical psychopathology,
do not seem warranted.

Of perhaps even more serious consequence from the point of view 
of the clinician, in the actual application to the clinical trial or experi- 
mental design is the matter of the clinical interpretation of the actual 
content of the mass of hyperdimensionalized readings to which such 
techniques have often been applied. It has become apparent that one 
of the major difficulties in the application of psychological scaling and 
testing to the subject matter of clinical psychiatry pertains to the 
question of validity—that is, as matters now stand, to the question of 
the agreement of the results of these procedures with the less formal 
observations of the clinician, and the acceptance by the clinician of
such results as providing valid knowledge. That is, the usefulness of
any statistical approach, including the selection of particular variables 
and the number of them, designed for the study of psychopathology 
is limited in practice to the extent of their acceptance by the clinician. 
The position of many clinicians regarding the acceptability of psycho- 
logical testing procedures is reflected in a comment by Nathan S. Kline 
[1957], one of the leading clinical investigators in the field of psycho- 
pharmacology. He views the data obtained by these approaches as a 
“Barmecide feast—an illusion or pretense of plenty.” 

While much of the clinician’s refusal to accept current psychological 
test procedures and the psychologist’s judgments inherent in them 
stems from what the clinician feels is essentially superficial, it is like- 
wise true that the clinician himself often lacks precision in his descrip- 
tion and evaluation of many of the phenomena with which he deals on 
a more direct observational level. While the usefulness of data systems 
presumably designed for the study of psychopathology is limited to 
the extent of their acceptability to the clinician, they are even more 
limited in a fundamental sense as long as clinical observations, de- 
scriptions, and concepts remain vague and imprecise, or lack clear 
definition. 

It is therefore evident that the development of data systems which 
describe behavior and psychopathology directly in clinical terms must 
be accompanied by an increased precision in the use of the language 
which describes the various aspects of mental illness. As this takes 
place the data to be gathered and analyzed will move in the direction 
of face validity, i.e., the kind of validity which stems from measuring 
or studying directly the relevant phenomenon itself, as opposed to 
putting primary emphasis on psychologic test procedures whose data 
must ultimately be re-interpreted in clinical terms. These comments 
are not intended as a general criticism or condemnation of psychological 
testing in clinical research. There can be little doubt that there are in 
current use a number of standard tests and procedures which are very 
useful in clinical evaluation and in research. But again, it is noted that 
the value of any such test for clinical research must in practice depend 
upon how well its results correlate with actual clinical observation. 
Thus it is clear that whether one is interested in extending the value 
and the validity of the psychological tests themselves, or in the more 
direct study and analysis of clinical material, or both, the need for 
the development of data systems at the level of ongoing clinical obser- 
vation and for greater precision in the description of clinical phenomena 
cannot be circumvented. 

The problem of increasing precision in the use of clinical terms is
in essence the same as the problem of increasing that aspect of
reliability which deals with the question of inter-rater or inter-observer
agreement. To a very large extent the problem of increasing precision 
is the problem of the development of operational definitions. To quote 
Bridgman [1945], “One of the greatest advantages of an operational 
breakdown of a situation is that it reduces it to a description of an 
actual happening—of something that has actually been done or that 
has actually occurred—and therefore it has the validity of actual ex- 
perience.” The connection between operational definition and inter- 
observer reliability is also stated clearly in the same paper of Bridgman: 
“A term is defined when the conditions are stated under which I may 
use the term, and when I may infer from the use of the term by my 
neighbor that the same conditions prevailed.” 

Acknowledging the need for operational definitions of clinical terms 
as a basic step toward achieving validity and reliability in the develop- 
ment of clinical statistical systems must also be accompanied by the 
realization that operational definitions can rarely, if ever, be complete. 
I think it is particularly important to keep this in mind in this field 
of application, because the data of psychopathology and interpersonal 
processes in general are so rich and varied in their specific content and 
detail, that the listing of all possible manifestations of a particular 
aspect of psychopathology might require an almost denumerable in- 
finity of words, sentences, and so forth. For this reason the events 
about which probability statements can be made must consist of unions 
of meaningful elementary events rather than of their intersections. 


The question of reliability has thus far been discussed within the
realm of inter-observer agreement, stressing the need for the operational
definition of clinical terms. Another kind of reliability with
which statistical psychologists have been preoccupied is that which is 
defined in terms of the consistency of responses of the same patient on 
repeated tests or examinations. It is worth noting that if we accept 
the stochastic definition of the patient-state as part of the approach to 
its operational definition, then the issue of test-retest reliability be- 
comes somewhat less acute. For, what previously has been interpreted 
as a shortcoming in an examination procedure may now be seen, at 
least in part, as a reflection of the underlying probabilistic aspects of 
the patient-state. 

Finally, I should like to discuss briefly the question of subjectivity
and objectivity as it exists in many important aspects of the data of
clinical psychiatry. In this regard the following comments of Merton
Gill and Margaret Brenman [1948], two noted psychoanalytic therapists,
are particularly pertinent. They state: “Here again our problems
are ... different from those of investigators in the physical sciences
where basic assumptions and their development and modification are 
built rigorously on the basis of specialized rather than everyday ex- 
perience. While the average citizen would feel it a presumption to 
attempt to erect an offhand law of thermodynamics based on his casual 
observations, the same citizen does not hesitate to enunciate final ‘laws 
of human nature’. Often these amateur psychologists make penetrat- 
ing observations and deductions because the materials or raw data of 
psychology surround all of us in our interpersonal relationships, thus 
providing the possibility for the development of intuitive theories in 
a way which would be impossible in physics or chemistry. ... ” 

Interpreting these comments on a level of applied statistics, the 
implication is that the investigator’s relevant knowledge of the subject- 
matter depends to a considerable extent on his experience in life. To 
the extent that no two individuals can be said to have had identical 
experience up to the formulation of an experimental design and the 
interpretation of its results, it can be said that no two investigators 
working independently would select the exact same data system, nor, 
given a data system and the set of possible results, could one expect 
complete agreement with regard to the order of preferred patient-states. 
Thus, as a condition of genuine progress with regard to many of the
subtler aspects of clinical data, it would appear essential to acknowledge
such interpersonal differences within the framework of any given
study, and to study and analyze distributions of preferences based on
differences of this kind in conjunction with the study of each set of
data within any given framework.

The suggested definition by Professor L. J. Savage [1954] of statistics 
proper as “dealing with vagueness and with interpersonal differences 
in decision situations,” appears particularly pertinent in this context. 

In concluding, I should like to summarize the main points of this 
paper as follows: 


First, the basic statistical approach to the study of the data of 
clinical psychology and psychiatry is through the development and 
application of stochastic models. This requires frequent repeated 
observations on each patient throughout the course of a study as part 
of the operational definition of psychopathology. In practice, this will 
be of immediate descriptive value to the clinician as well as providing 
a motivation for further mathematical statistical developments in 
sharpening the tools of inference in this field of application. 

Second, an essential characteristic of many of the outcome variables 
is their statistical dependence, referring here to between-patient
correlation in distinction or in addition to previous-state dependence.
This is a consideration which cannot be ignored, not only from the 
point of view of avoiding errors in statistical inference, but also because 
the phenomenon itself is of particular subject-matter interest. The 
use of stochastic models allowing the repetition of observation would 
seem to be a way of getting at this problem. 

Next, the question of multidimensionality and complementarity in 
the face of the need for frequent observations of the patient-state casts 
further doubt on the usefulness of data systems which reflect a high 
regard for apparent order of dimensionality and a low regard for clini- 
cal content. 

This, in turn, leads to a recognition of the need for striving toward 
operational definitions of clinical terms which will provide data with 
face validity and with greater inter-observer reliability, as well as data 
of more definite meaning to the clinician. 

Finally, a comprehensive approach to the study of interpersonal 
processes and psychopathology and, in particular, with regard to its 
more subtle aspects, requires an appreciation of the existence of a 
priori differences between investigators concerning such matters as the 
order of preference of patient-states and, consequently, the need to 
study such differences systematically. Such a priori differences are 
not essentially the result of differences in rationalistic interpretations, 
but are more appropriately regarded as the result of differences in past 
relevant experience.


REFERENCES 


Barankin, Edward W. [1956]. Toward an objectivistic theory of probability. Third
Berkeley Symposium 5, ed. Jerzy Neyman, 21-52.

Bridgman, P. W. [1945]. Some general principles of operational analysis. Psychol. 
Rev. 58, 246-9. 

Bush, Robert R. and Mosteller, Frederick. [1955]. Stochastic Models for Learning. 
New York: Wiley. 

Chassan, J. B. [1959]. A statistical description of a clinical trial of Promazine. 
Psychiatric Quart. (To be published). 

Gill, Merton and Brenman, Margaret. [1948]. Research in Psychotherapy Round 
Table 1947. Amer. J. Orthopsychiat. 18, 100. 

Kline, Nathan S. [1957]. Criteria for psychiatric improvement. Psychiat. Quart. 
31, 31-40. 

Marshall, A. W. and Goldhamer, H. [1955]. An application of Markov Processes to 
the study of epidemiology of mental disease. J. Amer. Stat. Assoc. 50, 99-129. 

Savage, L. J. [1954]. Foundations of Statistics. New York: Wiley. 




THE COMPARISON OF THE SENSITIVITIES OF SIMILAR 
EXPERIMENTS: MODEL II OF THE ANALYSIS OF 
VARIANCE* 


D. E. W. SCHUMANN AND R. A. BRADLEY


University of Stellenbosch, Stellenbosch, South Africa 
and 
Virginia Polytechnic Institute, Blacksburg, Virginia, U.S.A. 


1. Introduction 


The appropriate scale of measurement may not be apparent in 
many experimental situations and this is particularly true when sub- 
jective appraisals of samples are required. In this paper we consider 
the possibility that two scales of measurement, or two experimental 
techniques, are under consideration. Preliminary experimentation 
may be required to decide on the better of the two scales of measure- 
ment. The better scale of measurement, or the more sensitive one, is 
by definition the one that better demonstrates treatment effects (Model 
I of the analysis of variance) or the existence of a between-treatments 
component of variance (Model II), depending on the model used. We 
assume preliminary experimentation in the form of two parallel experiments,
appropriate for use of analysis of variance, that are distinct
and independent in probability but which use similar experimental
designs with separate samples from the same treatments. Cochran
[1943] and Carlin, Kempthorne, and Gordon [1956] have discussed this 
situation earlier as have the present authors. 

This paper is not directly concerned with the relationships that may 
exist between two scales. Various such relationships are discussed by 
Cochran and, when the form of a functional association between scales 
is known, special techniques for comparing the sensitivities of the 
scales may be devised. The usual situation appears to be the one in 
which scales are monotonically related but in which the relationship is 
otherwise unknown. In order to have meaningful experiments where
two scales are proposed, it is necessary that both scales measure essentially
the same attribute of the samples but the attribute may be
ill-defined as, for example, quality, performance, preference, ...

*Research sponsored in part by the Statistics Branch, Office of Naval Research. Reproduction
in whole or in part is permitted for any purpose of the United States Government. Presented at the
ENAR Meeting in Pittsburgh, Pa., March 19, 1959.

Schumann and Bradley [1957a, 1957b] developed the theory for 
comparing the sensitivities of two experiments where Model I of the 
analysis of variance is appropriate. The comparison of sensitivities 
was effected through a comparison of two independent non-central 
variance ratios from the two parallel experiments. We need now to 
review that work. 

$F_1$ and $F_2$ were defined to be two independent non-central variance-ratios
associated with experiments 1 and 2, both with $2a$ and $2b$ degrees
of freedom (most of the discussion was limited to similar experiments,
that is, experiments yielding variance-ratios having the same degrees
of freedom), where $2a$ and $2b$ may be odd or even, and with parameters
of noncentrality $\lambda_1$ and $\lambda_2$ respectively. With Model I of the analysis
of variance, the comparison of sensitivities depended on a test of
significance of the null hypothesis, $\lambda_1/k_1 = \lambda_2/k_2$, against a suitable
alternative, one-sided or two-sided. The hypothesis depends on the
definition

$$\lambda_j = k_j \sum_{i=1}^{t} \tau_{ij}^2 / 2\sigma_j^2, \qquad j = 1, 2, \tag{1}$$

where $\tau_{ij}$ is the effect of the $i$th of $t$ treatments measured from their
mean, $k_j$ is the number of observations in each treatment mean, and
$\sigma_j^2$ is the population error variance, each of these specifications for
experiment $j$. When $k_1 = k_2$, the experiments were termed identical.
Then the null hypothesis became $\lambda_1 = \lambda_2 = \lambda$ (say). As a test statistic,
the ratio of the two non-central $F$-statistics was used in the form

$$w = u_1/u_2, \tag{2}$$

where

$$u_j = 2aF_j/2b, \qquad j = 1, 2. \tag{3}$$


The distribution of $w$ was derived in the first paper [1957a] on this
subject. It was also shown there that an adequate approximation to
the distribution of $w$ could be obtained from the distribution of
$w = u_1/u_2$, $u_j = 2a'F_j/2b$, and $F_j$, for experiment $j$, $j = 1, 2$, regarded
as a central variance ratio but with $2a'$ and $2b$ degrees of freedom where

$$a' = (a + \lambda)^2/(a + 2\lambda). \tag{4}$$
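As a small illustration of equation (4), the adjusted half degrees of freedom can be computed directly. This is only a sketch: the function name `effective_half_df` and the argument name `lam` (standing for the noncentrality parameter $\lambda$) are chosen here for illustration and do not come from the paper.

```python
def effective_half_df(a: float, lam: float) -> float:
    """Return a' = (a + lam)^2 / (a + 2*lam), as in equation (4).

    a   : half the numerator degrees of freedom (integer or half-integer)
    lam : noncentrality parameter (hypothetical argument name)
    """
    if a <= 0 or lam < 0:
        raise ValueError("a must be positive and lam non-negative")
    return (a + lam) ** 2 / (a + 2.0 * lam)


# With lam = 0 the variance ratio is already central, so a' equals a.
print(effective_half_df(2.0, 0.0))
# A positive lam increases the value used to enter the table.
print(effective_half_df(2.0, 3.0))
```

The value returned would then be used in place of $a$ when entering the table, interpolating if it is not a tabulated argument.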


A table was provided for tests of significance and showed values of $w_0$
such that $P(w > w_0) = 0.05$ for given $a$ and $b$. When $w_0$ is required
such that $P(w > w_0) = 0.05$ for non-central variance ratios, $a$ and $\lambda$ are
used to obtain $a'$ in (4) and $a'$ is used as the value of $a$ for entering
that table. For full details on the use of the table in applications, the
reader is referred to the paper [1957b].

The purpose of the present paper is two-fold: 


(i) To consider ways of comparing the sensitivities of experiments 
when Model II (random effects) of the analysis of variance applies, and 

(ii) To make available additional tables of critical values of w which 
have been completed since the publication of the original table. 


2. Sensitivity Comparisons with Model II 


Let us consider Model II of the analysis of variance with additive
models such as

$$y_{i\alpha} = \mu + \eta_i + \epsilon_{i\alpha}, \qquad i = 1, \dots, t; \quad \alpha = 1, \dots, k \tag{5}$$

in the one-way classification, or

$$y_{i\alpha} = \mu + \eta_i + \beta_\alpha + \epsilon_{i\alpha}, \qquad i = 1, \dots, t; \quad \alpha = 1, \dots, k \tag{6}$$

in the two-way classification. (More complicated models may be
treated similarly and indeed the example below is for a more complex
pair of experiments.) In (5) and (6) $y_{i\alpha}$ represents the $\alpha$th observation
in the $i$th group, or perhaps the observation for the $\alpha$th batch of the
$i$th group. $\mu$ is the usual constant mean and the $\eta_i$'s are independent,
normally distributed, variates with zero means and variances $\sigma_1^2$ representing
the group effects. The $\beta_\alpha$'s may be similarly defined if desired
and the $\epsilon_{i\alpha}$'s, or residual effects, are independent, normally distributed,
and with zero means and variances $\sigma^2$. The $\eta$'s, $\beta$'s and $\epsilon$'s are assumed
mutually independent. The usual analysis-of-variance tables and
variance components may be constructed.

We shall consider the group effects and the residual effects only and
associate these problems with our earlier notation by using $2a$ for the
degrees of freedom associated with the mean square for groups and $2b$
for those associated with the error or residual mean square. (Again,
$a$ and $b$ may be integers or half-integers.) It is well known that the
expected mean square for groups is $(\sigma^2 + k\sigma_1^2)$ and that the expected
error mean square is $\sigma^2$. The corresponding sums of squares are distributed
as $\chi^2_{2a}(\sigma^2 + k\sigma_1^2)$ and as $\chi^2_{2b}\sigma^2$ and their ratio

$$v = \frac{2a}{2b}\, F \left(1 + \frac{k\sigma_1^2}{\sigma^2}\right) \tag{7}$$

has a distribution depending on $F$ with $F$ having the central variance-ratio
distribution with $2a$ and $2b$ degrees of freedom through the independence
of $\chi^2_{2a}$ and $\chi^2_{2b}$. We use $\chi^2_f$ to denote a central chi-square
variate with $f$ degrees of freedom.

Corresponding with the assumption of real treatment differences
when considering Model I, we now assume that $\sigma_1^2 \neq 0$ and define

$$\gamma = k\sigma_1^2/\sigma^2 \tag{8}$$

in which $\gamma/k$ is a measure of the group-variability component in comparison
with experimental error, thus a measure of relative group
variability. With postulated or expected variability among groups, a
comparison of the sensitivities of two independent experiments using
different scales of measurement, or different experimental techniques
(or with different experimental errors), should be effected through a
comparison of the relative group variabilities of the two experiments.
Let

$$\gamma_j = k_j\sigma_{1j}^2/\sigma_j^2, \qquad j = 1, 2, \tag{9}$$

be the parameters of relative group variability for two independent
experiments. The null hypothesis of equal sensitivities is

$$H_0: \gamma_1/k_1 = \gamma_2/k_2 \tag{10}$$

with $k_j$ being the number of observations in each group mean in experiment
$j$. As test statistics we use $v_1$ and $v_2$; following the definition
in (7), we have

$$v_j = 2a_jF_j(1 + \gamma_j)/2b_j, \qquad j = 1, 2. \tag{11}$$

$F_j$ is now the $F$-statistic with $2a_j$ and $2b_j$ degrees of freedom for the
$j$th experiment.

For tests of $H_0$ in (10), we use the ratio $v_1/v_2$ or a constant times
that ratio. Let $\gamma_1/k_1 = \gamma_2/k_2 = q$ and $v_j = a_jF_j(1 + k_jq)/b_j$, $j = 1, 2$.
We define

$$w = c'v_1/v_2 \tag{12}$$

where

$$c' = a_2b_1(1 + k_2q)/a_1b_2(1 + k_1q). \tag{13}$$

The distribution of $w$ can be obtained by replacing $c$ by $c'$ in equation
(24) of the earlier paper [1957a]. In this general case, approximate
use of prepared tables may be made as suggested in that paper. Usually
attention will be directed to the comparison of identical experiments
with $a_1 = a_2$, $b_1 = b_2$, and $k_1 = k_2$.
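The role of the constant in (13) can be checked numerically: multiplying $v_1/v_2$ by $c'$ cancels the design constants, so that $w$ equals the ratio of the two variance ratios. The sketch below uses made-up design values; the function names and the tuple layout `(a, b, k, F)` are illustrative assumptions, not notation from the paper.

```python
def v_statistic(a, b, k, q, F):
    """v_j = a_j * F_j * (1 + k_j * q) / b_j, as defined before (12)."""
    return a * F * (1.0 + k * q) / b


def c_prime(a1, b1, k1, a2, b2, k2, q):
    """The constant c' of equation (13)."""
    return (a2 * b1 * (1.0 + k2 * q)) / (a1 * b2 * (1.0 + k1 * q))


def w_statistic(exp1, exp2, q):
    """w = c' * v_1 / v_2 for two experiments given as (a, b, k, F)."""
    a1, b1, k1, F1 = exp1
    a2, b2, k2, F2 = exp2
    v1 = v_statistic(a1, b1, k1, q, F1)
    v2 = v_statistic(a2, b2, k2, q, F2)
    return c_prime(a1, b1, k1, a2, b2, k2, q) * v1 / v2


# Hypothetical experiments (values invented for illustration only).
exp1 = (4.5, 9.0, 6, 12.0)   # a_1, b_1, k_1, F_1
exp2 = (3.0, 6.0, 4, 8.0)    # a_2, b_2, k_2, F_2
w = w_statistic(exp1, exp2, q=0.7)
print(w, exp1[3] / exp2[3])  # the two numbers agree
```

Whatever value of $q$ is supplied, the design constants cancel in this calculation, which is why only $F_1/F_2$ is needed once the experiments are identical.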



When identical experiments are to be compared, $H_0$ reduces to
$\gamma_1 = \gamma_2$, and we use $w = v_1/v_2 = F_1/F_2$ as the test statistic. Now we
can use prepared tables directly to obtain $w_0$ such that $P(w > w_0) = \alpha$,
needing only the values of $a$ and $b$ to enter a table. (A table with $\alpha$ =
0.05 was published earlier [1957b] and a revision of that table along with
tables with $\alpha$ = 0.025, 0.01, and 0.005 are given with this paper.)

The form of the density function of $w$, $g(w; a, b)$, is given in (26) of
[1957a], and the properties of this function and the corresponding
distribution function $G(w; a, b)$ are discussed there. For the test
procedure, in practice, we use $w = F_1/F_2$ with $2a$ and $2b$ degrees of
freedom and test $H_0: \gamma_1 = \gamma_2$ against a one-sided alternative (i)
$H_a: \gamma_1 > \gamma_2$ (or $\gamma_2 > \gamma_1$) or a two-sided alternative (ii) $H_a: \gamma_1 \neq \gamma_2$.
The test procedure is to reject $H_0$ with significance level $\alpha$ in favor of
$H_a$(i) when $w > w_0(\alpha)$ [or $1/w > w_0(\alpha)$], or with significance level $2\alpha$
in favor of $H_a$(ii) when $w > w_0(\alpha)$ or $1/w > w_0(\alpha)$. If we do not assume
$\gamma_1 = \gamma_2$, $w$ may be taken to be $(1 + \gamma_2)F_1/(1 + \gamma_1)F_2$ and the tables
may be used to obtain confidence limits on $(1 + \gamma_2)/(1 + \gamma_1)$.
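The decision rule above can be sketched in a few lines. The critical value $w_0(\alpha)$ is meant to be read from the tables; for readers without them, the sketch also estimates it by simulating the ratio of two independent central variance ratios. The simulation is a stand-in assumed here for illustration and is not the recursion method used to compute the published tables. All function names are hypothetical.

```python
import random


def central_f(rng, df1, df2):
    """Draw one central F variate as a ratio of scaled chi-squares.

    A chi-square with df degrees of freedom is Gamma(shape=df/2, scale=2).
    """
    num = rng.gammavariate(df1 / 2.0, 2.0) / df1
    den = rng.gammavariate(df2 / 2.0, 2.0) / df2
    return num / den


def simulate_w0(alpha, df1, df2, n_draws=20000, seed=0):
    """Monte Carlo estimate of w0 with P(F1/F2 > w0) = alpha under H0."""
    rng = random.Random(seed)
    draws = sorted(
        central_f(rng, df1, df2) / central_f(rng, df1, df2)
        for _ in range(n_draws)
    )
    index = min(n_draws - 1, int((1.0 - alpha) * n_draws))
    return draws[index]


def reject_equal_sensitivity(F1, F2, w0, two_sided=False):
    """Apply the rule: reject H0 when w > w0 (or also 1/w > w0 if two-sided)."""
    w = F1 / F2
    if two_sided:
        return w > w0 or 1.0 / w > w0
    return w > w0


# Estimated critical value for 2a = 19 and 2b = 38 at alpha = 0.05.
print(simulate_w0(0.05, 19, 38))
```

Note that with a two-sided alternative the same $w_0(\alpha)$ gives a test of level $2\alpha$, as stated in the text, so `two_sided=True` should be paired with the table for half the intended level.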


3. Tables 


Tables 1, 2, 3, and 4 contain values of $w_0(\alpha)$ for $\alpha$ = 0.05, 0.025,
0.01, and 0.005 and for values of $2a$ of 2(1)21 and 30, 40, 60, and 120,
and values of $2b$ of 2(2)20, 30, 40, and $\infty$. Table 1 is a repetition of
the table published earlier, but with additional entries, and Tables 2,
3, and 4 are new.

Some difficulties were encountered in calculating table entries for
the larger values of $b$ except for $b = \infty$. The difficulties arose from
loss in accuracy through extended use of recursion formulas. Consequently,
for $b$ = 15 and 20, only one-decimal accuracy was attained.
Also some values of $w_0$ for $b = 10$ may differ by unity in their second
decimal places from the true value, although it was judged useful to
show values obtained to two decimal places.

The tables are essentially triangular for

$$G(w_0; a, b) = G(w_0; b, a)$$

or

$$\int_0^{w_0} g(w; a, b)\,dw = \int_0^{w_0} g(w; b, a)\,dw.$$


4. An Example 


Kauman, Gottstein and Lantican [1956] considered numerical
and subjective methods of evaluating the quality of dried veneer. The
quality ratings varied respectively from 0 to 50 and 0 to 8 with 0 in
each case indicating a sheet free of degrade. Both systems were somewhat
subjective but they did result in quite separate scales.

TABLE 1
VALUES OF $w_0$ SUCH THAT $\int_{w_0}^{\infty} g(w; a, b)\,dw = .05$
[Table entries are not legible in this copy.]

TABLE 2
VALUES OF $w_0$ SUCH THAT $\int_{w_0}^{\infty} g(w; a, b)\,dw = .025$
[Table entries are not legible in this copy.]

TABLE 3
VALUES OF $w_0$ SUCH THAT $\int_{w_0}^{\infty} g(w; a, b)\,dw = .01$
[Table entries are not legible in this copy.]

TABLE 4
VALUES OF $w_0$ SUCH THAT $\int_{w_0}^{\infty} g(w; a, b)\,dw = .005$
[Table entries are not legible in this copy.]

Kauman et al. used 20 sheets, 3 observers, and 2 spaced replications
for each observer on each sheet. Judgings by numerical and subjective 
schemes and for the two replications were taken to be independent as 
required for the valid use of the methods of this paper. An attempt to 
achieve this independence was made through rerandomization of the 
sheets for each set of judgments along with spacing of the various 
parts of the experiment in time. The authors assumed Model II of the 
analysis of variance. For the general procedure with m sheets, n 
observers, and p spaced replications, their analysis has the general 
form of Table 5 in which we show the expectations of mean squares. 


TABLE 5

THE GENERAL FORM OF EXPECTED MEAN SQUARES FOR THE
EXPERIMENTS OF KAUMAN et al.

Source                                  d.f.                 Expected Mean Square
Sheets (S)                              m - 1                $\sigma^2_{SR} + p\sigma^2_{SO} + np\sigma^2_S$
Replications (within observers) (R)     n(p - 1)             $\sigma^2_{SR} + m\sigma^2_R$
S × O                                   (m - 1)(n - 1)       $\sigma^2_{SR} + p\sigma^2_{SO}$
S × R                                   n(m - 1)(p - 1)      $\sigma^2_{SR}$


Interest centers on the method that shows the larger relative com- 
ponent of variance for sheets in that that method may be regarded to 
be the more sensitive in distinguishing veneer-sheet quality differences. 
This experiment is not simply a one-way or two-way classification as 
discussed at the beginning of Section 2, but the methods developed for 
comparing sensitivities may be used here and in many other similar 
situations. 

We may associate $\sigma^2_{SR} + p\sigma^2_{SO}$ with $\sigma^2$ and $np\sigma^2_S$ with $k\sigma_1^2$ of our
earlier development. The mean square for sheets corresponds to the
mean square for groups and the S × O mean square with the mean
square for error. Further $k = np$ and $\gamma = np\sigma^2_S/(\sigma^2_{SR} + p\sigma^2_{SO})$. The
null hypothesis becomes $H_0: \gamma_1 = \gamma_2$ for the two independent identical
experiments. If $F$ is the ratio of mean square for sheets to the S × O
mean square, $w = F_1/F_2$, again for the two identical experiments.
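The correspondence just described can be written out as a short sketch. The degrees of freedom follow from the design sizes, and the relative variability uses $k = np$. The variance components passed in below are invented purely to exercise the function, and the function names are hypothetical.

```python
def degrees_of_freedom(m, n):
    """Return (2a, 2b): d.f. for sheets and for the S x O interaction."""
    return m - 1, (m - 1) * (n - 1)


def relative_variability(n, p, var_s, var_so, var_sr):
    """gamma = n*p*var_S / (var_SR + p*var_SO), with k = n*p."""
    k = n * p
    return k * var_s / (var_sr + p * var_so)


# Design of the veneer experiments: 20 sheets, 3 observers, 2 replications.
two_a, two_b = degrees_of_freedom(m=20, n=3)
print(two_a, two_b)

# Hypothetical variance components (not taken from the data).
gamma = relative_variability(n=3, p=2, var_s=2.0, var_so=0.5, var_sr=1.0)
print(gamma)
```

For this design the function returns the 19 and 38 degrees of freedom with which the tables are entered in the example.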

Kauman et al. show their data and analyses and in particular give
analyses of variance in their Table 6. We only note that $m = 20$,
$n = 3$, $p = 2$, $F_1 = 31.15$, $F_2 = 22.36$, and $2a = 19$ while $2b = 38$.
We take experiment 1 to be with the numerical scheme and experiment
2 with the subjective scheme. It follows at once that $w = F_1/F_2 = 1.39$.
The null hypothesis is $H_0: \gamma_1 = \gamma_2$, and we choose $H_a: \gamma_1 > \gamma_2$ with
the thought that the more complicated numerical scheme would have
been judged the more sensitive a priori.

The analysis is essentially complete with the test of significance and
$\alpha$ might have been chosen as 0.05. Then $w_0(0.05) = 2.6$ from Table 1
(entered with 19 and 38 degrees of freedom) and the observed $w$ is
nonsignificant.
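The arithmetic of this test is short enough to reproduce directly. The sketch below uses only the values quoted in the text; the variable names are chosen here for illustration.

```python
# Variance ratios quoted from the analyses of Kauman et al.
F1 = 31.15   # numerical scheme (experiment 1)
F2 = 22.36   # subjective scheme (experiment 2)

# Tabulated critical value quoted in the text for 19 and 38 d.f.
w0_05 = 2.6

w = F1 / F2
significant = w > w0_05   # one-sided test of H0 against gamma_1 > gamma_2

print(round(w, 2), significant)
```

Because the observed ratio falls well below the critical value, the one-sided test does not reject equality of the two sensitivities.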

The 0.95-confidence limits for $(1 + \gamma_2)/(1 + \gamma_1)$ are based on entries
from Table 2 and the rejection rule for the two-sided test. Now
$w = (1 + \gamma_2)F_1/(1 + \gamma_1)F_2$ and the confidence interval is based on the
probability statement

$$P\left[\frac{1}{w_0(0.025)} < \frac{(1 + \gamma_2)F_1}{(1 + \gamma_1)F_2} < w_0(0.025)\right] = 0.95,$$

where $w_0(0.025)$ is determined from the table using 19 and 38 degrees
of freedom. Insertion of $F_1/F_2 = 1.39$ and $w_0(0.025) = 3.1$ yields the
confidence interval,

$$0.23 < (1 + \gamma_2)/(1 + \gamma_1) < 2.28.$$
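Solving the probability statement for the ratio of interest gives limits $1/(w_0 r)$ and $w_0/r$, where $r = F_1/F_2$. The sketch below carries out that inversion with the rounded inputs quoted in the text; the function name is hypothetical, and because the inputs are rounded the computed limits should be expected to match the printed interval only approximately.

```python
def ratio_confidence_limits(F1, F2, w0):
    """Limits for (1 + gamma_2)/(1 + gamma_1).

    Obtained by dividing 1/w0 < ratio * (F1/F2) < w0 through by F1/F2.
    """
    r = F1 / F2
    return 1.0 / (w0 * r), w0 / r


lower, upper = ratio_confidence_limits(31.15, 22.36, 3.1)
print(lower, upper)
```

Since the interval contains 1, it is consistent with the nonsignificant result of the test.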


These analyses support the conclusions of Kauman, Gottstein, and 
Lantican, namely that “The present experiment has shown that sub- 
jective evaluation can yield results of an accuracy approaching that of 
the numerical scheme, although the accuracy of the latter was slightly 
superior.” 


5. Summary and Discussion 


This paper extends earlier work on the comparison of sensitivi- 
ties of experiments based on Model I of the analysis of variance to such 
comparisons for experiments based on Model II of the analysis of 
variance. The procedures are similar to but somewhat simpler than 
those developed earlier and require the same tables. 

Along with the new applications developed, we have included new 
and extended tables for the distributions of the ratio of two central 
variance-ratios with equal pairs of degrees of freedom. Tables of 
critical values of that ratio $w$ show values of $w_0(\alpha)$ such that
$P[w > w_0(\alpha)] = \alpha$ for $\alpha$ = 0.05, 0.025, 0.01, and 0.005 and for variance-ratio
degrees of freedom 2(1)21, 30, 40, 60, and 120 in the numerator
and 2(2)20, 30, 40, and $\infty$ in the denominator.




An example is given using data previously published and both the 
test of significance on sensitivities and a confidence interval are obtained. 
Conclusions regarding the example are in agreement with those stated 
by the authors of the example. 

The work by the authors to date has been limited to sensitivity 
comparisons for two separate independent experiments involving 
samples from a single set of treatments. Very often in practice experi- 
menters will make several measurements or obtain several scores on 
each sample in a single experiment. The problem now becomes one in 
the domain of multivariate analysis but interest may still be on the 
selection of the variate that alone is most sensitive in the sense developed 
here. Additional research is required to solve this related but different 
problem and the methods of the present paper should not be applied 
in such situations. 


REFERENCES 


Bradley, R. A. and Schumann, D. E. W. [1957b]. The comparison of the sensitivities 
of similar experiments: applications. Biometrics 13, 496-510. 

Carlin, A. F., Kempthorne, O., and Gordon, J. [1956]. Some aspects of numerical 
scoring in subjective evaluation of foods. Food Research 21, 273-81. 

Kauman, W. G., Gottstein, J. W., and Lantican, D. [1956]. Quality evaluation by 
numerical and subjective methods, with applications to dried veneer. Biometrics 
12, 127-53. 

Schumann, D. E. W. and Bradley, R. A. [1957a]. The comparison of the sensitivities
of similar experiments: theory. Ann. Math. Stat. 28, 902-20.


Note Added in Proof: As this issue of Biometrics was in page proof, 
the following paper was published: Bross, I. D. J. [1959]. Note on an 
application of the Schumann-Bradley Table. Ann. Math. Stat. 30,
581-83. Bross develops the same tests of significance shown in this
paper, but does not consider the confidence interval nor provide any 
additional tables. 




THE USE OF THE POWER FUNCTION TO DETERMINE AN 
ADEQUATE NUMBER OF PROGENY PER SIRE IN A 
GENETIC EXPERIMENT INVOLVING HALF-SIBS* 


STANLEY WEARDEN
Statistical Laboratory 
Kansas State University 
Manhattan, Kansas, U.S.A. 


Introduction 


Genetic experiments which involve large animals are often limited 
in scope by factors outside the experiment. Due to a lack of sufficient 
time, manpower, and space, in addition to the expense of such under- 
takings, there are few extensive genetic experiments using large animals. 
Instead, the experimenter must use whatever data or animals are 
available, and he needs to know when he has an adequate number of 
observations per genetic group to undertake a genetic study concerning 
an estimate of heritability of a magnitude of practical importance. 

A common genetic group used in studies of large animals consists of 
paternal half-sibs. Such is the case in most progeny tests, and one of the 
most common methods of estimating heritability is a function of the 
paternal half-sib correlation. However, the reported estimates of the 
heritability of a given quantitative trait are often as divergent as the 
bounds (zero to one). Consequently the geneticist is led to wonder 
what constitutes an adequate sample of a sire’s genes. This paper 
proposes a method for determining the number of observations required 
per genetic group in order to detect significant differences among groups 
when the heritability of the trait is at least as great as some prede- 
termined, minimum magnitude. 


Heritability in the Narrow Sense


The use of heritability assumes the linear model for the characteristic
under consideration to be

$$Y_{ij} = \mu + s_i + e_{ij}, \tag{1}$$

where $i = 1, \dots, k$ and $j = 1, \dots, n_i$. Thus the individual record
($Y_{ij}$) is the sum of an effect ($\mu$) common to all the animals, a random
effect ($s_i$) due to the sire of the animal, and another random term ($e_{ij}$)
which contains both genetic and environmental effects peculiar to the
record of that particular animal.

*Contribution No. 41 of the Statistical Laboratory, Kansas Agricultural Experiment Station,
Manhattan, Kansas.

If the analysis of variance for the completely randomized design 
under Model II is used, the following expected values of mean squares 
are obtained: 


$$E(\text{Among Sires M.S.}) = \sigma_e^2 + n\sigma_s^2, \tag{2}$$

$$E(\text{Progeny within Sires M.S.}) = \sigma_e^2. \tag{3}$$

In (2) and (3), $n$ represents the number of progeny per sire*, $\sigma_s^2$ is the
variance due to the sire effects, and $\sigma_e^2$ is the pooled variance among
progeny within sire groups. Using this analysis of variance, the paternal
half-sib correlation is simply the intra-class correlation as described by
Fisher [1954]:

$$\rho = \frac{\sigma_s^2}{\sigma_s^2 + \sigma_e^2}. \tag{4}$$
If it is assumed that the only uncontrolled factors which affect an
animal's record are either genetic or randomly environmental, and that
these are uncorrelated, then the total variance of an observed characteristic
($\sigma_s^2 + \sigma_e^2$) can also be expressed as the sum of the genetic variance
($\sigma_G^2$) and the environmental variance ($\sigma_E^2$). In (1) the sire effect may also
be expressed as

$$s_i = c\,\tau_i, \tag{5}$$

where $\tau_i$ is the deviation of the $i$th sire from the mean with respect to
the characteristic under study, and $c$ is one-half of the fraction of that
variability which is heritable, for an offspring samples only one-half of
each parent's autosomal genes. If the progeny of all sires are treated as
nearly alike as possible, $\sigma_s^2$ is the genetic covariance among half-sibs.
When it is assumed that additive gene action is the only form of inheritance
affecting a trait, the genetic covariance ($\sigma_{ij}$) between the
$i$th and the $j$th animals is some function of the additive genetic variance
($\sigma^2_{10}$). Lush [1948] has shown this relationship to be

$$\sigma_{ij} = a_{ij}\sigma^2_{10}, \tag{6}$$

where $a_{ij}$ is the coefficient of additive relationship between the two
animals; the notation is that of Henderson [1954]. Since the total
variance has been defined as the sum of the genetic variance ($\sigma_G^2 = \sigma^2_{10}$)
and the environmental variance ($\sigma_E^2$), using (4) and (6) the paternal
half-sib correlation can be written as

$$\rho = \frac{a_{ij}\sigma^2_{10}}{\sigma^2_{10} + \sigma_E^2}. \tag{7}$$

*In the event there is an unequal number of progeny per sire, n in (2) represents

Heritability in the narrow sense, symbolized by $h^2$ in this paper, has
been defined by Lush [1948] as the ratio of the additive variance to the
total variance. Hence (4) must be multiplied by the reciprocal of the
coefficient of additive relationship to obtain an estimate of heritability:

$$h^2 = \frac{1}{a_{ij}} \cdot \frac{\sigma_s^2}{\sigma_s^2 + \sigma_e^2} \tag{8}$$

Through an algebraic manipulation of (8), it can be shown that

$$\sigma_s^2 = \frac{h^2}{\dfrac{1}{a_{ij}} - h^2}\,\sigma_e^2 \tag{9}$$
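The relations among the variance components, the half-sib correlation, and heritability can be checked numerically. The sketch below is only an illustration: the function names and the numerical variance components are assumptions, not part of the paper, and the sire variance is written in the algebraically equivalent form $a_{ij}h^2\sigma_e^2/(1 - a_{ij}h^2)$.

```python
def half_sib_quantities(sigma_s2, sigma_e2, a_ij=0.25):
    """Return (t, h2, theta) from the sire and within-sire variance components."""
    t = sigma_s2 / (sigma_s2 + sigma_e2)   # intra-class (half-sib) correlation
    h2 = t / a_ij                          # heritability in the narrow sense
    theta = sigma_s2 / sigma_e2            # ratio of sire to within-sire variance
    return t, h2, theta


def sire_variance_from_h2(h2, sigma_e2, a_ij=0.25):
    """Sire variance implied by a given heritability and error variance."""
    return a_ij * h2 * sigma_e2 / (1.0 - a_ij * h2)


# Hypothetical variance components chosen so that h2 comes out at .6.
t, h2, theta = half_sib_quantities(0.15, 0.85)
print(round(t, 3), round(h2, 3), round(theta, 4))
print(round(sire_variance_from_h2(0.6, 0.85), 4))
```

Going from the components to $h^2$ and back again recovers the same sire variance, which is the content of the algebraic manipulation.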


Power of the Test 


Scheffé [1956] has shown how one may determine the power ($\beta$) of the
test for the analysis of variance with the infinite model (Model II).
This procedure can also be used in an experiment to determine the
number of observations needed per group in order, with given power,
to detect differences among the groups. In this paper, the procedure is
based on the probability of detecting differences among the sire effects
($s_i$), since an estimate of heritability based on data which could not
demonstrate any genetic differences which exist among sires would be
of very limited value. The failure to detect a real difference among
main effects (Type II error) is in part due to the size of the difference(s),
which in turn depends, in this study, on the size of heritability. The
probability of making such an error is $1 - \beta$. The probability of
making a Type I error (i.e. concluding erroneously that there is a
difference among genetic groups) is determined by the value ($\alpha$) in the
F-table which is used to test the variance ratio.

Assuming that the sires are a random sample from some population
which can be defined by the experimenter, the power of the test is

$$\beta(\theta) = P\left\{F_{\nu_1,\nu_2} \ge \frac{F_{\alpha;\nu_1,\nu_2}}{1 + n\theta}\right\} \tag{11}$$




The power is thus seen to involve the central F-distribution.* In (11),
$\nu_1$ is the number of degrees of freedom associated with the "among
sires" mean square, $\nu_2$ is that associated with the error mean square,
$F_{\nu_1,\nu_2}$ is a random value, and $F_{\alpha;\nu_1,\nu_2}$ is a tabular value. The parameter
$\theta$ is the ratio of the sire variance to the within-sire variance, which by
(9) can be shown to be

$$\theta = \frac{h^2}{\dfrac{1}{a_{ij}} - h^2} \tag{12}$$

If the power of the test is set at some particular value, and the
corresponding F-value ($F_{\beta;\nu_1,\nu_2}$) is determined, then the equality in
(11) is satisfied by

$$F_{\beta;\nu_1,\nu_2} = \frac{F_{\alpha;\nu_1,\nu_2}}{1 + n\theta}$$

and n can be obtained in the following fashion:

$$n = \frac{1}{\theta}\left(\frac{F_\alpha}{F_\beta} - 1\right) \tag{13}$$


The solution of (13) is iterative, for the sizes of the two F-values depend
on $\nu_2$, which depends on n. For a rapid solution to (13), the
experimenter should choose values of $F_\alpha$ and $F_\beta$ corresponding to a large
number of degrees of freedom in the error mean square, and then solve
for n. He should then determine what $\nu_2$ would have been had he
actually used this number of progeny per sire. The appropriate F-values
for this new number of degrees of freedom are then to be inserted in
(13), and a new approximation of n calculated. A second, and perhaps
a third, iteration may be required in order to fix the number of half-sibs
per group at one particular value.


Example 


Suppose an experimenter wishes to estimate the additive genetic
variance or conduct a progeny test with respect to a certain trait. In
this example, suppose the experimenter does not desire to detect
differences among the sires unless the heritability of this trait is no lower
than .6, and that he arbitrarily decides to set the levels of $\alpha$ and $\beta$ at
.05 and .75 respectively. Let it be further supposed that a random
sample of 25 sires is available for the study. If the sires and their


*As Scheffé [1956] pointed out, the non-central F-distribution as described by P. C. Tang [1938] 
is the appropriate tool to determine the power only when the main effects are fixed (Model I). 




progeny are non-inbred, the coefficient ($a_{ij}$) of additive relationship
between half-sibs will be .25.

Since the experimenter should choose a large number of degrees of
freedom for the error mean square, let him use $\nu_2 = \infty$. Under these
conditions, $F_\alpha = 1.52$, and $1/F_\beta = 1.26$* [Tables 10.5.3 and 10.18.1,
Snedecor, 1956]. Substituting these values in (13),

$$n = \frac{4 - .6}{.6}\,[(1.52)(1.26) - 1] = (5.67)(.9152) = 5.2.$$

As it is impossible for a sire to have other than a whole number of
progeny, let n = 6. With k = 25 and n = 6, the degrees of freedom associated
with the error mean square are 25 (6 − 1) = 125. The table contains
no value of $1/F_\beta$ corresponding to $\nu_2 = 125$, but it would be so close to
the tabular value for $\nu_2 = 120$ that no interpolation is necessary. Using
the new F-values in (13),

$$n = \frac{4 - .6}{.6}\,[(1.60)(1.28) - 1] = 5.9.$$

Thus it can be seen that the appropriate number of progeny per sire is
between 5.2 and 5.9. To avoid a fractional number of progeny, the
experimenter should have six progeny of each of the 25 sires. If there
is an unequal number of progeny per sire, n represents an
approximation** of $n_0$ as described earlier.
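The two passes of the worked example can be retraced in a few lines. This is a minimal sketch: the function name is an assumption, and the F-values are entered by hand from the tabular values quoted in the example rather than computed.

```python
import math


def progeny_per_sire(h2, f_alpha, inv_f_beta, a_ij=0.25):
    """Solve n = (1/theta) * (F_alpha / F_beta - 1) for given tabular F-values."""
    theta = a_ij * h2 / (1.0 - a_ij * h2)   # ratio of sire to within-sire variance
    return (f_alpha * inv_f_beta - 1.0) / theta


k = 25                                       # number of sires
n1 = progeny_per_sire(0.6, 1.52, 1.26)       # error d.f. taken as infinite
df_error = k * (math.ceil(n1) - 1)           # d.f. implied by rounding n1 up
n2 = progeny_per_sire(0.6, 1.60, 1.28)       # tabular values for 120 d.f.
print(round(n1, 1), df_error, round(n2, 1))
```

Both approximations round up to six progeny per sire, so no third pass is needed here.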

This procedure is based on the probability of detecting a real
difference among sire groups when heritability exceeds some minimum
magnitude. The experiment described in this example has a power of
the F-test of .75; thus the probability of detecting a true difference
among sires is .75 when the rejection level is set at .05. For varying
numbers of sires, the calculated numbers of progeny per sire to be used
in genetic experiments for various magnitudes of heritability, when $\alpha$
is set at .01 and $\beta$ at .75, are presented in Table 1. The values for the
case when $\alpha$ = .05 and $\beta$ = .75 are presented in Table 2.


Summary 


The variance due to sires ($\sigma_s^2$) has been expressed as a function of the
error variance ($\sigma_e^2$), this function being dependent on the size of
heritability. The power of the test is used, along with the above function,
to determine an adequate number of progeny per sire to detect a true
genetic difference when heritability is of some specified minimum
magnitude. This number is dependent on the probabilities of the Type


or disproportional subclass numbers, @ is distributed only approximately as F.




TABLE 1
NUMBER OF PROGENY NEEDED PER SIRE IN A GENETIC EXPERIMENT
WHEN α = .01 AND β(θ) = .75

                         Number of Sires
Heritability      7      9     11     13     17     21     25
    .10         152    117     96     --     --     --     --
    .20          76     58     48     41     33     28     25
    .30          50     38     32     27     22     19     17
    .40          --     --     --     --     --     --     --
    .50          29     22     18     16     13     11     10
    .60          24     18     --     13     11      9      8

TABLE 2
NUMBER OF PROGENY NEEDED PER SIRE IN A GENETIC EXPERIMENT
WHEN α = .05 AND β(θ) = .75

                         Number of Sires
Heritability      7      9     11     13     17     21     25
    .10         104     81     67     59     47     41     36
    .20          52     40     33     29     24     20     18
    .30          34     26     22     20     16     14     12
    .40          25     20     16     14     12     10      9
    .50          20     16     13     12     10      8     --
    .60          --     --     --     --     --     --      6

(-- marks an entry that is illegible in this copy.)
I and Type II error which the experimenter wishes to use. Calculated
numbers of progeny per sire for varying numbers of sires and different
size estimates of heritability are presented in tabular form, according
to the probabilities of the Type I and Type II errors.


ACKNOWLEDGMENT 


The author would like to acknowledge the suggestions and criticisms 
of Miss Lorraine Schwartz during the preparation of this paper. 


REFERENCES 


Fisher, R. A. [1954]. Statistical Methods for Research Workers, 12th ed. Oliver and
Boyd, Edinburgh.

Henderson, C. R. [1954]. Animal husbandry 120 notes. Mimeographed lectures,
Cornell University, Ithaca.

Lush, J. L. [1948]. The genetics of populations. Mimeographed lectures, Iowa State
College, Ames.

Scheffé, Henry. [1956]. The analysis of variance. Mimeographed lectures,
University of California, Berkeley.

Snedecor, G. W. [1956]. Statistical Methods, 5th ed. Iowa State College Press, Ames.

Tang, P. C. [1938]. The power function of the analysis of variance tests with tables
and illustrations of their use. Stat. Res. Memoirs 2: 126.




CONFIDENCE LIMITS FOR THE LD50 USING THE
MOVING AVERAGE-ANGLE METHOD


Eugene K. Harris


Robert A. Taft Sanitary Engineering Center 
Public Health Service 
Cincinnati, Ohio, U.S.A. 


Introduction 


The use of moving averages as a smoothing procedure is well known
to statisticians. Moving average interpolation for LD50 estimates was
proposed by Thompson [1947] and recently extended by Bennett [1952]
to include the angular transformation. The purpose of this
transformation is, of course, to make the variance of an observed proportion
largely independent of the unknown true proportion. Thompson and
Weil [1952] and Weil [1952] have published formulas and tables useful
in the calculation of approximate confidence limits for the LD50 without
reference to the angular transformation.

The following sections present "exact"* confidence limits and
significance tests under the moving average-angle procedure, when the
doses are separated by a constant interval (almost always a log interval
in bioassay) and an equal number of animals has been exposed to each
dose. Under the moving average-angle method, formulas to handle
differing intervals and/or numbers of animals may be easily derived
when needed (cf. Bennett, [1952]).

The observed proportional response at each dose is transformed to
an angle, $\theta(p) = \arcsin\sqrt{p}$, using, for example, Table XII in Fisher
and Yates [1948]. When p = 0 or 1, Bartlett's adjustments (cf.
Eisenhart, [1947]) should be applied, namely $\theta(0) = \arcsin\sqrt{1/4n}$ and
$\theta(1) = 90 - \theta(0)$, where n is the number of animals at each dose. Then
simple averages of k successive angles are computed, where k is usually
3 or 5. Each average angle is associated with the middle dose of the
respective set of k doses.
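The transformation and averaging steps can be sketched as follows. The function names are assumptions; the proportions are the per cent mortalities of the dieldrin example given later in the paper (ten fish per dose), listed from the highest dose to the lowest.

```python
import math


def angle(p, n):
    """Angle in degrees for an observed proportion p from n animals.

    Bartlett's adjustment is applied when p is exactly 0 or 1.
    """
    theta0 = math.degrees(math.asin(math.sqrt(1.0 / (4 * n))))
    if p == 0:
        return theta0
    if p == 1:
        return 90.0 - theta0
    return math.degrees(math.asin(math.sqrt(p)))


def moving_averages(angles, k=3):
    """Simple averages of k successive angles, one per interior dose."""
    return [sum(angles[i:i + k]) / k for i in range(len(angles) - k + 1)]


proportions = [1.0, 1.0, 0.8, 0.5, 0.3, 0.0]
angles = [angle(p, 10) for p in proportions]
averages = moving_averages(angles, k=3)
print([round(a, 1) for a in angles])
print([round(a, 1) for a in averages])
```

Each average belongs to the middle dose of its set of three, so six doses yield four average angles.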

The LD50 is estimated by linear interpolation between the two


*Not strictly exact, of course, since, as noted below, the observed rather than expected propor- 
tional response is transformed to an angle. 




successive doses whose average angles bracket 45°. Thus, if y and y'
are adjacent average angles, such that y < 45 < y', and if x and x'
denote the corresponding log doses, then

$$\text{Est. } \log \mathrm{LD}_{50} = x + (x' - x)\,\frac{45 - y}{y' - y}.$$
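As a small illustration of the interpolation, the sketch below uses the two average angles that bracket 45° in the dieldrin example later in the paper. The function name is an assumption, and the log doses are on the paper's scale of log(dose) + 2.

```python
def log_ld50(x, x_next, y, y_next):
    """Interpolate linearly to 45 degrees between two adjacent average angles."""
    if not (y < 45.0 < y_next):
        raise ValueError("the average angles must bracket 45 degrees")
    a = (45.0 - y) / (y_next - y)
    return x + (x_next - x) * a


est = log_ld50(0.250, 0.375, 29.1, 47.2)   # estimate of log LD50 + 2
ld50_ppm = 10 ** (est - 2)
print(round(est, 3), round(ld50_ppm, 4))
```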


Confidence Limits for the LD50


Considering the dose fixed, confidence limits for the LD50 are based
on limits for A = (45 − y)/(y' − y) = X/Y, say.

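Under the assumption that average angles are normally distributed,
confidence limits for A are available from Fieller's theorem ([1944];
also see Geary, [1930]). These are

$$\frac{1}{1-g}\left[A - g\,\frac{\operatorname{Cov}(X,Y)}{\operatorname{Var} Y} \pm \frac{Z}{Y}\left\{\operatorname{Var} X - 2A\operatorname{Cov}(X,Y) + A^2\operatorname{Var} Y - g\left(\operatorname{Var} X - \frac{\operatorname{Cov}^2(X,Y)}{\operatorname{Var} Y}\right)\right\}^{1/2}\right]$$

where Z is the normal deviate corresponding to the two-sided confidence
level and $g = (Z^2 \operatorname{Var} Y)/Y^2$ must be less than unity to assure that
y' − y is significantly different from zero. Now,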


Var X = $C^2/nk$,
Var Y = Var y' + Var y − 2 Cov (y, y') = $2C^2/nk^2$,
and Cov (X, Y) = Var y − Cov (y, y') = $C^2/nk^2$,


where $C^2 = (90/\pi)^2 = 820.7$ is a constant with respect to the
transformation to angles in degrees.

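Since Var X = (k/2) Var Y, and Cov (X, Y) = (Var Y)/2, confidence
limits for A may be expressed entirely in terms of A, k, and g, where
g is $1641.4\,Z^2/nk^2Y^2$. These limits are:

$$A_L,\ A_U = \frac{A - \tfrac{1}{2}g}{1 - g} \mp \frac{1}{1-g}\left\{g\left[\frac{k}{2} - A(1 - A) - \frac{(2k - 1)\,g}{4}\right]\right\}^{1/2} \tag{1}$$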


Confidence limits for the log LD50 are then $x + (x' - x)A_L$ and
$x + (x' - x)A_U$, where $A_L$ and $A_U$ are respectively the lower and
upper limits for A. The tables below list $A_L$ and $A_U$ for A = 0 (.05) .50
and g = .01 (.01) .10; .10 (.05) .80, when k = 3 and 5. Confidence
limits for a value of A, say A', greater than .5 may be obtained from the
limits for A = 1 − A' since $A'_L = 1 - A_U$ and $A'_U = 1 - A_L$. This
may be seen from the following argument, for which the writer is



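indebted to one of the referees. The expression under the radical sign
in (1) has identical values for A and A' when g and k are fixed. Writing
$A_U = [(A - \tfrac{1}{2}g)/(1 - g)] + W$ and $A_L = [(A - \tfrac{1}{2}g)/(1 - g)] - W$, then

$$A'_L = \frac{1 - A - \tfrac{1}{2}g}{1 - g} - W = 1 - \left[\frac{A - \tfrac{1}{2}g}{1 - g} + W\right] = 1 - A_U,$$

and similarly $A'_U = 1 - A_L$.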


When g > .85, limits for A are so wide that unless (x' − x) is very
small, the LD50 estimate is practically worthless. In fact, the slope of
the dosage-response curve has been badly overestimated, resulting in
too small an interval between doses and a serious waste of experimental
animals. In an attempt to salvage the situation without repeating the
experiment (on an altered dosage scale), we might try skipping a dose
and interpolating between non-adjacent average angles which bracket
45°. This device will not necessarily succeed, however, since Var Y
will now equal $4C^2/nk^2$, or twice its value for adjacent averages, and
the new value of $Y^2$ would have to be much more than double the old
value in order to achieve a substantial reduction in g.

It may be noted from (1) that, for given g, the limits will be wider
for k = 5 than for k = 3, and this is, of course, confirmed in the tables.
In analysis of a specific series of experimental results, however, g will
be considerably smaller when derived from 5-point moving averages
than from 3-point averages since g varies as $1/k^2$, and the difference
will be sufficient to ensure narrower confidence limits.

More complete tables, including results for g = .01 (.01).30, are 
available on request. 


Example 


The following data were obtained by Mr. C. Henderson, biologist
at the Sanitary Engineering Center, and concern the toxicity of the
insecticide dieldrin in hard water (pH: 8.2, hardness: 400 ppm measured
as calcium carbonate) to fathead minnows (P. promelas) exposed for
72 hours in groups of ten fish per dose.


Dose            Log      Per cent               3-point
(ppm X 100)     Dose     Mortality    Angle     Moving Avg.
5.62            .750     100          80.9**
4.21            .625     100          80.9**    75.1
3.16            .500      80          63.4      63.1
2.37            .375      50          45.0      47.2
1.78            .250      30          33.2      29.1
1.33            .125       0           9.1*


*θ(0) = arcsin √(1/40) = 9.1.
**θ(1) = 90 − θ(0) = 80.9.


Est. (log LD50 + 2) = .250 + (.125)[(45 − 29.1)/(47.2 − 29.1)]
                    = .250 + (.125)(.88)
                    = .360.


Est. LD50 = .0229 ppm.


For 95% confidence limits, g = 1641.4(1.96)²/[(10)(9)(18.1)²] = .21.

A'_L (where A' is .88) = 1 − A_U (where A is .12)
                       = 1 − .636*** = .364.
A'_U = 1 − A_L
     = 1 + .599 = 1.599.


95% confidence limits for log LD50 + 2 are


.250 + (.125)(.364) = .296,
and .250 + (.125)(1.599) = .450


or for the LD50: .0198-.0282 ppm.
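The limits in this example can also be computed directly rather than read from the tables. The sketch below substitutes Var X = (k/2) Var Y and Cov (X, Y) = (Var Y)/2 into Fieller's limits, giving the closed form in A, k, and g. The function name is an assumption, and because g and A are not rounded to .21 and .12, the results differ slightly from the tabular interpolation.

```python
import math


def limits_for_a(a, g, k):
    """Lower and upper limits for A under the moving average-angle variances."""
    if not g < 1.0:
        raise ValueError("g must be less than unity")
    centre = (a - g / 2.0) / (1.0 - g)
    radicand = g * (k / 2.0 - a * (1.0 - a) - g * (2 * k - 1) / 4.0)
    w = math.sqrt(radicand) / (1.0 - g)
    return centre - w, centre + w


n, k, z = 10, 3, 1.96              # fish per dose, span of the average, normal deviate
y, y_next = 29.1, 47.2             # average angles bracketing 45 degrees
x, x_next = 0.250, 0.375           # corresponding values of log(dose) + 2

g = 1641.4 * z ** 2 / (n * k ** 2 * (y_next - y) ** 2)
a = (45.0 - y) / (y_next - y)
a_low, a_high = limits_for_a(a, g, k)
ld50_low = 10 ** (x + (x_next - x) * a_low - 2)
ld50_high = 10 ** (x + (x_next - x) * a_high - 2)
print(round(g, 3), round(a_low, 3), round(a_high, 3))
print(round(ld50_low, 4), round(ld50_high, 4))
```

The direct limits for A come out near .357 and 1.606, against .364 and 1.599 from the tables, and the limits for the LD50 agree with the tabular result to about the third significant figure.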


Testing the Significance of the Difference Between Two Estimated LD50's


If $m_1$ and $m_2 < m_1$ denote the estimated log LD50's for materials
1 and 2, respectively, then the variable tested here is $m_1 - m_2$.


It is commonly recognized that dosage response curves must be
substantially parallel in order that a comparison of LD50's alone may
serve as a general contrast between the effects of two materials. Assume,
then, as a condition of the test that the ratios $(y'_1 - y_1)/(x'_1 - x_1)$ and
$(y'_2 - y_2)/(x'_2 - x_2)$ may be replaced by a common slope $b_c$. Thus,
consider

$$m_1 - m_2 = (x_1 - x_2) + \frac{y_2 - y_1}{b_c}.$$

The terms $(y_2 - y_1)$ and $b_c$ are independent and assumed normally
distributed. A lower confidence limit for their ratio is given by

$$R_L = \frac{1}{1 - h}\left[\frac{y_2 - y_1}{b_c} - \frac{Z}{b_c}\left\{(1 - h)\operatorname{Var}(y_2 - y_1) + \left(\frac{y_2 - y_1}{b_c}\right)^2 \operatorname{Var} b_c\right\}^{1/2}\right]$$

***This value was obtained by first interpolating linearly between the upper limits for A = .10
at g = .20 and .25 (.602 and .665) to obtain .615 at g = .21, and similarly for A = .15 to obtain .668.
Then, simple interpolation between these two values gave $A_U$ = .636 at A = .12, g = .21. A similar
process using the lower limits yielded $A_L$ = −.599. Both limits agree with results given by direct
application of (1) to within one unit in the third decimal place.


[Tables of the lower and upper limits $A_L$ and $A_U$ for k = 3 and k = 5, with A = 0 (.05) .50 and g = .01 (.01) .10; .10 (.05) .80. The tabulated entries are not legible in this copy.]


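where $h = Z^2 \operatorname{Var} b_c / b_c^2$ must be less than unity. The two LD50 estimates
will be judged significantly different if $(x_1 - x_2) + R_L > 0$.
In obtaining $R_L$ compute

$$\operatorname{Var}(y_2 - y_1) = 2C^2/nk = 1641.4/nk,$$

$$b_c = \frac{(x'_1 - x_1)(y'_1 - y_1) + (x'_2 - x_2)(y'_2 - y_2)}{(x'_1 - x_1)^2 + (x'_2 - x_2)^2} = \frac{N}{D}, \text{ say,}$$

and

$$\operatorname{Var} b_c = 2C^2/Dnk^2 = 1641.4/Dnk^2.$$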
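A rough sketch of this test is given below, under stated assumptions. The typeset formulas for $R_L$ and for the common slope are damaged in this copy, so the code takes $R_L$ to be the standard Fieller lower limit for a ratio of independent normal variates and takes $b_c$ to be the pooled least-squares slope N/D with D the sum of squared log-dose intervals, the reading that is consistent with Var $b_c$ = 1641.4/Dnk². The function name and all input data are hypothetical.

```python
import math

TWO_C_SQUARED = 1641.4   # 2 * (90 / pi)^2, for angles in degrees


def ld50_difference_criterion(mat1, mat2, n, k, z=1.96):
    """Return (x1 - x2) + R_L; a positive value is judged significant.

    mat1 and mat2 are tuples (x, x_next, y, y_next) for materials 1 and 2.
    """
    x1, x1n, y1, y1n = mat1
    x2, x2n, y2, y2n = mat2
    d = (x1n - x1) ** 2 + (x2n - x2) ** 2
    b = ((x1n - x1) * (y1n - y1) + (x2n - x2) * (y2n - y2)) / d
    var_b = TWO_C_SQUARED / (d * n * k ** 2)
    var_diff = TWO_C_SQUARED / (n * k)
    h = z ** 2 * var_b / b ** 2
    if not h < 1.0:
        raise ValueError("h must be less than unity")
    r = (y2 - y1) / b
    spread = (z / b) * math.sqrt((1.0 - h) * var_diff + r ** 2 * var_b)
    r_low = (r - spread) / (1.0 - h)
    return (x1 - x2) + r_low


# Hypothetical bracketing doses and average angles for two materials.
criterion = ld50_difference_criterion(
    (0.500, 0.625, 40.0, 55.0), (0.250, 0.375, 38.0, 52.0), n=10, k=3)
print(round(criterion, 3), criterion > 0)
```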


REFERENCES 


Bennett, B. M. [1952]. Estimation of LD50 by moving averages. J. of Hygiene 50,
157-64.

Eisenhart, C. [1947]. Inverse sine transformation of proportions. Techniques of
Statistical Analysis, Chap. 16, McGraw-Hill, New York.

Fieller, E. C. [1944]. A fundamental formula in the statistics of biological assay,
and some applications. Quart. J. Pharmacy 17, 117-23.

Fisher, Ronald A., and Yates, Frank. [1948]. Statistical Tables for Biological,
Agricultural, and Medical Research. Hafner Publishing Company, Inc., New York.

Geary, R. C. [1930]. The frequency distribution of the quotient of two normal
variates. J. Roy. Stat. Soc. 93, 442-46.

Thompson, William R. [1947]. Use of moving averages and interpolation to
estimate median effective dose. Bact. Rev. 11, 115-45.

Thompson, William R., and Weil, Carrol S. [1952]. On the construction of tables
for moving average interpolation. Biometrics 8, 51-4.

Weil, Carrol S. [1952]. Tables for convenient calculation of median-effective dose
(LD50 or ED50) and instructions in their use. Biometrics 8, 249-53.




CONTRIBUTION TO THE STUDY OF GROUPED 
OBSERVATIONS. IV. SOME COMMENTS ON 
SIMPLE ESTIMATES* 


N. F. Gjeddebæk


A/S Ferrosan 
Copenhagen, Denmark 


1. INTRODUCTION 


When studying coarsely grouped normal distributions, we must be
able to take a view of the efficiency of the estimates of mean ξ and
standard deviation σ which may be obtained by various methods. In
previous papers by the author [1, 2] the efficiency of estimates based on
grouped observations is measured against the efficiency of estimates
from the corresponding ungrouped observations. For the maximum
likelihood estimates of ξ and σ we have listed the efficiencies arrived at
from this point of view. The efficiency of the simple mean as estimate
of the true one when σ is known has been calculated. It is found that
in many practical cases the simple mean is not second to the maximum
likelihood estimate.

To go further with the study of the simple estimates of ξ and σ of
a grouped normal distribution we shall here investigate means and
variances of these estimates in cases where one parameter is known as
well as when both are unknown a priori.

Say that from a normal population we have a sample grouped in
(k + 1) groups, where k is limited in practice. Observations falling in
the ith group are referred to some abscissa $\mu_i$ lying between $x_i$ and
$x_{i+1}$, which are the limits of the ith group. Thus we have:


                    Number of        Reference     Probability
Group limits        observations     point         of the group
x_0 - x_1           n_0              μ_0           P_0
  ...               ...              ...           ...
x_k - x_{k+1}       n_k              μ_k           P_k
Totals              N                              1


*Contribution to the Fourth International Biometric Conference, Ottawa, 1958. Publication was 
supported in part by a grant from the United States National Science Foundation. 




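Here $P_i$ is

$$P_i = \int_{x_i}^{x_{i+1}} \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x - \xi)^2}{2\sigma^2}}\,dx,$$

where ξ and σ are the parameters of the normal distribution considered.
For known σ it has been previously found [2] that the simple estimate

$$M = \frac{1}{N}\sum_{i=0}^{k} n_i\mu_i$$

of ξ has the mean

$$\mathfrak{M}\{M\} = \sum_{i=0}^{k} P_i\mu_i$$

and the variance

$$V\{M\} = \frac{1}{N}\left[\sum_{i=0}^{k} P_i\mu_i^2 - \left(\sum_{i=0}^{k} P_i\mu_i\right)^2\right] + \left(\sum_{i=0}^{k} P_i\mu_i - \xi\right)^2,$$

where $\sum_{i=0}^{k} P_i\mu_i - \xi = B$ is called the bias of M.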
If we look upon the distribution of M on its own merits, we have the
usual variance of this distribution equal to

$$\frac{1}{N}\left[\sum_{i=0}^{k} P_i\mu_i^2 - \left(\sum_{i=0}^{k} P_i\mu_i\right)^2\right].$$

We shall, however, use M as an estimate of ξ and shall compare it with
another estimate of this true value. So we shall here use the expectation
of the square of the deviation from ξ as the variance of M.

As N goes to infinity, all the members of the M-distribution
concentrate more and more about their local true mean with a local variance
going to zero. With respect to ξ, however, the variance of M is going
to $B^2$. We may say that the goodness with which M estimates ξ is
measured by the reciprocal of its variance with respect to ξ. The same
principle will be used for the variance of simple estimates of $\sigma^2$.
Throughout the present paper the meaning of the term variance is this
one.
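To make this sense of "variance" concrete, the sketch below computes the group probabilities for a grouped normal distribution and then the mean, the bias B, and the variance of M about ξ (local variance plus squared bias). The function names, group limits, and reference points are hypothetical choices for illustration.

```python
import math


def normal_cdf(x, xi=0.0, sigma=1.0):
    """Cumulative distribution function of the normal distribution."""
    return 0.5 * (1.0 + math.erf((x - xi) / (sigma * math.sqrt(2.0))))


def mean_bias_variance_of_m(limits, ref_points, n_obs, xi=0.0, sigma=1.0):
    """limits: interior group limits; ref_points: one reference point per group."""
    edges = [-math.inf] + list(limits) + [math.inf]
    probs = [normal_cdf(hi, xi, sigma) - normal_cdf(lo, xi, sigma)
             for lo, hi in zip(edges[:-1], edges[1:])]
    mean_m = sum(p * u for p, u in zip(probs, ref_points))
    second = sum(p * u * u for p, u in zip(probs, ref_points))
    bias = mean_m - xi
    variance = (second - mean_m ** 2) / n_obs + bias ** 2
    return mean_m, bias, variance


# Three groups with limits at -1 and 1 and reference points -2, 0, 2.
centred = mean_bias_variance_of_m([-1.0, 1.0], [-2.0, 0.0, 2.0], n_obs=20)
shifted = mean_bias_variance_of_m([-1.0, 1.0], [-2.0, 0.0, 2.0], n_obs=10 ** 6, xi=0.3)
print([round(v, 4) for v in centred])
print([round(v, 4) for v in shifted])
```

With the true mean off centre and N very large, the variance in this sense settles at the squared bias instead of going to zero.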

As M and the corresponding simple estimate of $\sigma^2$ are inconsistent,
it is not possible to speak strictly of their efficiency in the common
sense of this word. We should, however, like to express in some way the
goodness of our simple estimates as compared with estimates from
corresponding ungrouped data. In our case it seems suitable to define
the efficiency of a simple estimate as the variance of the estimate from
ungrouped data divided by the variance of the simple estimate in
question. So we obtain that our inconsistent estimates have an
asymptotic efficiency zero. This seems acceptable. For N limited, our simple
estimates have reasonable efficiencies in accordance with their usefulness.

If now ξ is the known parameter, we may take the following simple
expression $S^2$ as an estimate of $\sigma^2$:

$$S^2 = \frac{1}{N}\sum_{i=0}^{k} n_i(\mu_i - \xi)^2,$$

i.e. $S^2$ is the mean of the squares of the obtained deviations from the
true mean.

For mean and variance of $S^2$ we obtain by analogy to what we have
for M that

$$\mathfrak{M}\{S^2\} = \sum_{i=0}^{k} P_i(\mu_i - \xi)^2$$

and

$$V\{S^2\} = \frac{1}{N}\left[\sum_{i=0}^{k} P_i(\mu_i - \xi)^4 - \left(\sum_{i=0}^{k} P_i(\mu_i - \xi)^2\right)^2\right] + \left[\sum_{i=0}^{k} P_i(\mu_i - \xi)^2 - \sigma^2\right]^2.$$


2. BOTH PARAMETERS UNKNOWN


When both ξ and σ are unknown, we may calculate the following
simple estimates of these parameters:

$$M = \frac{1}{N}\sum_{i=0}^{k} n_i\mu_i$$

and

$$S^2 = \frac{1}{N - 1}\sum_{i=0}^{k} n_i(\mu_i - M)^2.$$

For $\mathfrak{M}\{M\}$ and $V\{M\}$ we obtain the same expressions as we find
for σ known, but for $S^2$ we now have

$$\mathfrak{M}\{S^2\} = \sum_{i=0}^{k} P_i(\mu_i - \mathfrak{M})^2$$

and

$$V\{S^2\} = \frac{1}{N}\left[\sum_{i=0}^{k} P_i(\mu_i - \mathfrak{M})^4 - \left(\sum_{i=0}^{k} P_i(\mu_i - \mathfrak{M})^2\right)^2\right] + \left[\sum_{i=0}^{k} P_i(\mu_i - \mathfrak{M})^2 - \sigma^2\right]^2 + \frac{2}{N(N - 1)}\left[\sum_{i=0}^{k} P_i(\mu_i - \mathfrak{M})^2\right]^2,$$

where we have put $\mathfrak{M}\{M\} = \mathfrak{M}$.


It turns out that the first two members of $V\{S^2\}$ are the same as
those found when ξ is known, except that ξ is now replaced by $\mathfrak{M}$. The
third member of $V\{S^2\}$ may be written $2[\mathfrak{M}\{S^2\}]^2/N(N - 1)$, and it
reminds us not to use too small N. On the other hand, N need not be
very great before this member is negligible.

As N → ∞ the first member of $V\{S^2\}$ also vanishes, and so we have
$V\{S^2\} \to [\sum P_i(\mu_i - \mathfrak{M})^2 - \sigma^2]^2$, which is the square of the bias
of $S^2$. Accordingly, the standard deviation of $S^2$ can never be smaller
than the bias of $S^2$. The same thing is shown for M [2].

Now let the greatest group interval → 0 while N is limited. Then
the bias of $S^2$ goes to zero while the variance of $S^2$ goes to $2\sigma^4/(N - 1)$.
The efficiency of $S^2$ from grouped observations, as compared to the
same estimate of $\sigma^2$ taken from the corresponding ungrouped data, is
therefore $2\sigma^4/(N - 1)$ divided by the variance of $S^2$ as derived above.
If at this point we let N go to infinity, we find that the asymptotic
efficiency of $S^2$ is 0 because the bias of $S^2$ is > 0. Hereafter $S^2$ seems
inferior to the maximum likelihood estimate s of σ which is derived in
[3] and the asymptotic efficiency of which is finite as found in [1].

It will be seen that the same considerations may be made also if the
distribution in question is not normal. Instead of $2\sigma^4/(N - 1)$ we
must only use the corresponding general expression for an arbitrary
distribution.

Unfortunately, all this is not of great significance as the bias of the
simple estimate $S^2$ of $\sigma^2$ generally is too disturbing. In all cases where
the grouping is not equidistant we must therefore use the maximum
likelihood procedure, e.g. as outlined in [3]. For equidistant grouping,
however, we may utilise Sheppard's correction.


3. EQUIDISTANT GROUPING 


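If our grouping is equidistant, we may use the following simple
estimate of $\sigma^2$:

$$S^2 = \frac{1}{N - 1}\sum_{i=0}^{k} n_i(\mu_i - M)^2 - \frac{h^2}{12},$$

where h is the group-width. This is the Sheppard method. Then we
obtain

$$\mathfrak{M}\{S^2\} = \sum_{i=0}^{k} P_i(\mu_i - \mathfrak{M})^2 - \frac{h^2}{12}$$

and

$$V\{S^2\} = \frac{1}{N}\left[\sum_{i=0}^{k} P_i(\mu_i - \mathfrak{M})^4 - \left(\sum_{i=0}^{k} P_i(\mu_i - \mathfrak{M})^2\right)^2\right] + \left[\sum_{i=0}^{k} P_i(\mu_i - \mathfrak{M})^2 - \frac{h^2}{12} - \sigma^2\right]^2 + \frac{2}{N(N - 1)}\left[\sum_{i=0}^{k} P_i(\mu_i - \mathfrak{M})^2\right]^2$$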


instead of the corresponding expressions from §2. We find that the
asymptotic efficiency of $S^2$ is 0 almost everywhere because the bias
of $S^2$ is unequal to zero almost everywhere. There is now one point of
singularity. Superficially, little seems to be achieved, but in fact the
bias of $S^2$ is now so small that it is of little significance for limited N as
will be seen from §4.


4. EXAMPLES 


Look upon a sample from a normal population distributed in
equidistant groups of width h = 2. Let the true mean be 0 and the true
standard deviation be 1 and define the phase-relationship between the
true mean and the group limits as y, the abscissa of the group limit
nearest to the true mean, divided by h. Then we shall here investigate
the bias and efficiency of $S^2$ for y = 0, 1/4, and 1/2.

The bias of $S^2$ is

$$B_S = \sum P_i(\mu_i - \mathfrak{M})^2 - \frac{h^2}{12} - \sigma^2 = \sum P_i(\mu_i - \mathfrak{M})^2 - 1.3333$$

while the efficiency of $S^2$ is

$$E_S = 2 \div \left\{\frac{N - 1}{N}\left[\sum P_i(\mu_i - \mathfrak{M})^4 - \left(\sum P_i(\mu_i - \mathfrak{M})^2\right)^2\right] + (N - 1)B_S^2 + \frac{2}{N}\left(\sum P_i(\mu_i - \mathfrak{M})^2\right)^2\right\}$$

as will be seen from §§2, 3.

We obtain the results given in the following table. The asymptotic
efficiency [1] of the maximum likelihood estimate s of σ is given at the
bottom.

The table reveals some interesting things, firstly that for N between
10 and 100 the efficiency of $S^2$ is not much affected by $B_S$. If we can
expect the efficiency of the maximum likelihood estimate s to be
approximately the same for limited N as it is for N → ∞, we also see that s is
not much more efficient than S for relatively limited N. Secondly, if
we pick out the singular point on the y-scale where $B_S$ = 0 (it must be a
point very near y = 1/4), we have for this special value of y that $E_S$ for
N → ∞ is very near $E_s$ for N → ∞. All in all, this shows that the
Sheppard method is quite acceptable in many practical cases.

The maximum likelihood procedure will have its advantages in the
more complicated cases where the grouping is not equidistant or where
N is a large number and great accuracy is required.
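The entries of the table can be approximately reproduced from first principles. The sketch below assumes, as in the example, a standard normal population, group width h = 2, and group midpoints as reference points; the function names are assumptions. The variance combines the sampling term, the squared bias, and the term in 1/N(N − 1), and the efficiency is 2/(N − 1) divided by that variance.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def grouped_moments(y, h=2.0, span=10):
    """Second and fourth central moments of the reference points.

    Group limits sit at (y + j) * h; each reference point is a group midpoint.
    """
    groups = []
    for j in range(-span, span):
        lo = (y + j) * h
        groups.append((normal_cdf(lo + h) - normal_cdf(lo), lo + h / 2.0))
    mean = sum(p * u for p, u in groups)
    m2 = sum(p * (u - mean) ** 2 for p, u in groups)
    m4 = sum(p * (u - mean) ** 4 for p, u in groups)
    return m2, m4


def sheppard_bias_and_efficiency(y, n_obs, h=2.0):
    """Bias and efficiency of the Sheppard-corrected S^2 when sigma = 1."""
    m2, m4 = grouped_moments(y, h)
    bias = m2 - h * h / 12.0 - 1.0
    variance = ((m4 - m2 ** 2) / n_obs + bias ** 2
                + 2.0 * m2 ** 2 / (n_obs * (n_obs - 1)))
    return bias, (2.0 / (n_obs - 1)) / variance


for y in (0.0, 0.25, 0.5):
    bias, eff = sheppard_bias_and_efficiency(y, 10)
    print(y, round(bias, 4), round(100 * eff, 1))
```

The biases come out within about 0.0001 of the tabulated values, a difference that may reflect rounding in the original hand computation.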




TABLE 1
BIASES AND EFFICIENCIES OF S²
Group-Width h = 2

Phase-Relationship        y = 0      y = 1/4    y = 1/2
B_S                       0.0316     0.0000     −0.0316
Σ P(μ − 𝔐)⁴               4.673      5.203      5.725
[Σ P(μ − 𝔐)²]²            1.863      1.778      1.694
E_S for N = 10            68.7%      58.2%      50.3%
E_S for N = 100           68.5%      58.4%      48.5%
E_S for N = 1000          52.5%      58.4%      39.8%
E_S for N = 10000         15.6%      58.4%      14.3%
E_S for N → ∞             0          0          0

E_s for N → ∞             54.1%      58.7%      63.5%


For h smaller than 2 these comments are even more true, but for 
h greater than 2 the situation becomes intricate when using the Sheppard 
as well as when using the maximum likelihood method. 


5. CONCLUDING REMARKS 


From the preceding paragraphs we may deduce that for equidistant 
grouping with intervals less than twice the standard deviation the 
simple estimate of the variance completed with Sheppard’s correction 
practically is as efficient as the maximum likelihood estimate at least 
for a number of observations less than one hundred. The maximum 
likelihood method is, however, useful when the grouping is not equi- 
distant or when the number of observations is very large. The corre- 
sponding things are shown for the simple mean versus the maximum 
likelihood mean in the paper [2]. 


REFERENCES 
By the Author 


[1] Contribution to the Study of Grouped Observations II, Skand. Aktuarietidskr.
Vol. 39, p. 154, 1956.

[2] Contribution to the Study of Grouped Observations III, Skand. Aktuarietidskr.
Vol. 40, p. 20, 1957.

[3] Contribution to the Study of Grouped Observations I, Skand. Aktuarietidskr.
Vol. 32, p. 135, 1949.


By Others


Cornish, E. A. and Fisher, R. A. Revue de l'Inst. Int. Stat., Vol. 5, 1937, pp. 307-20.
Cramér, Harald. Mathematical Methods of Statistics, 1946, Princeton University.
Fisher, R. A. Phil. Trans. Royal Soc. London, Series A, Vol. 222, 1921, pp. 309-68.
Fisher, R. A. Statistical Methods for Research Workers, Oliver & Boyd, Edinburgh.

Greenberg, B. G. and Sarhan, A. E. E. Some applications of order statistics. I. S. I.
Bull. 36, 1958, pp. 172-83.

Hald, A., Jersild, M., and Rasch, G. Acta Path. Microbiol. Scand., Vol. 20, 1943, pp.
64-85.

Kulldorff, G. Maximum likelihood estimation. Skand. Aktuarietidskr., Vol. 41,
1958, pp. 1-36.



SOME RECENT RESULTS IN CHI-SQUARE 
GOODNESS-OF-FIT TESTS* 


G. S. Watson 


Australian National University 
Canberra, Australia** 


Summary 


The aim of this paper is to relate and extend some recent work on 
chi-square goodness-of-fit tests. There is no discussion of any problems 
which are specifically associated with more than one categorical variable. 
The main topics are the effect of estimation on chi-square and its par- 
titions and their relation to Neyman’s smooth goodness-of-fit tests, and 
the effect of grouping a univariate distribution according to the dis- 
position of the sample on the distribution of the chi-square statistic 
and on the smooth test statistic. 


1. General Discussion 


This paper was prepared as a contribution to a symposium of recent 
work on chi-square goodness-of-fit tests. The topics covered are there- 
fore but a small selection of those which a complete review would 
consider and their choice is a personal matter. Moreover their interest 
is largely theoretical although not, it is hoped, of an unpractical nature. 
For the chi-square test is not only the oldest of the non-trivial signif- 
icance tests, and one of the most widely used, but it is also basic in 
statistics—many important concepts arose from its study and an 
understanding of it is a necessary background for the study of other 
statistical problems, pure and applied. In this section a general dis- 
cussion will be given of the areas touched in the paper which begins 
formally in Section 2. 

The criterion of Pearson*** 


*This paper was written while the author was visiting the Department of Experimental Statis- 
tics, North Carolina State College. 

Contribution to the Fourth International Biometric Conference, Ottawa, 1958. Publication was 
supported in part by a grant from the United States National Science Foundation. 

**Present address: Department of Mathematics, University of Toronto, Toronto, Canada.

***Following Cochran we write X² for this statistic, rather than χ².




(observed — expected)’, 
expected 


X? = 


though often decried, continues to be used more than any of its com- 
petitors, and for good reasons. For the true multinomial situation, the 
likelihood ratio statistics 


2> (observed) log, (sbecrved) 
expected 


have some theoretical advantages but for large samples they become 
equivalent and for small samples their behavior is similar on the evidence 
available. Lindley has suggested in the discussion to Watson [1958] 
that the likelihood ratio statistic lends itself to an analysis of information 
(in the sense of Shannon) corresponding to the familiar partitioning of 
$X^2$ and that the relative powers of these two criteria should be
investigated, possibly by electronic computation. The most striking
fact in this whole area is the satisfactory, but imperfect, way both 
criteria are approximated by the chi-square distribution despite their 
irregular and discrete distributions for small samples. Much study 
has been devoted to the question of when the chi-square approximation 
is adequate. Alternatively, improvements to the chi-square approxi- 
mation have been sought and some references to this work are given at 
the end of Section 3. It is just possible that other statistics, with similar 
powers, may be better approximated, a problem that does not seem to 
have been given attention. In summary of this field it is fair to say 
that there are as yet no distributional reasons for preferring other 
statistics to $X^2$ and that few errors of consequence are made by assuming
that $X^2$ follows the chi-square distribution.
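
The two criteria compared above can be computed side by side. The following is a minimal sketch; the function names and the die-throwing counts are hypothetical and serve only as illustration.

```python
import math

def pearson_x2(observed, expected):
    """Pearson's criterion: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def likelihood_ratio(observed, expected):
    """Likelihood ratio criterion: 2 * sum of observed * ln(observed / expected).
    Empty cells contribute zero (the limit of o * ln(o) as o tends to 0)."""
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)

# Hypothetical data: 60 throws of a die, each face expected 10 times.
observed = [8, 12, 9, 11, 6, 14]
expected = [10.0] * 6
x2 = pearson_x2(observed, expected)
g2 = likelihood_ratio(observed, expected)
print(round(x2, 3), round(g2, 3))
```

For counts of this size the two values are close, in line with the remark that the criteria behave similarly.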

The solutions to the problems which arise when the expected fre- 
quencies are functions of parameters requiring estimation represent 
important basic theorems of theoretical statistics. For small samples, 
the cases where the parameters admit sufficient statistics may be treated 
by using the conditional distributions of the criteria with the sufficient 
statistic held constant. In the case of 2 × 2 tables this device is well
known and tables permit its easy use with marginal totals up to 20;
beyond this point the chi-square approximation will be adequate. This
latter assertion implies that, in large samples, the ordering of tables by
$X^2$ is the same as by their conditional probabilities or that, in view of
the unstated nature of the alternatives, the slight differences in the
rejection regions are not important. Since the conditional distribution
of $X^2$ is free of nuisance parameters, it would be helpful to know that
it too, when the sample increases, tends to the chi-square distribution 




obtained unconditionally in the limit. This fairly obvious result which 
seems to have been implied but not proved in the past is demonstrated 
in Section 3. It is true whether we use for “expected,” the sample size 
times the cell probabilities with their parameters replaced by their 
maximum likelihood (m.l.) estimators or their conditional expectations.

The conditions under which estimation of parameters leads simply 
to a reduction of degrees of freedom are set out in Section 2, along with 
a discussion of the effect of estimation on single degrees of freedom. 
The classical result is that a degree of freedom is lost for every parameter 
fitted if m.l. (or asymptotically equivalent) estimators are used, 
provided these estimators have a bias going to zero faster than $N^{-1/2}$
and variances and covariances of order $N^{-1}$, the usual situation. If
estimators are used which have a covariance matrix different from
these but of the same order in N, a disturbance is added to $X^2$ due to
the resulting poorer fit. Such estimators might be used, e.g. in testing 
the fit of a Poisson distribution. In this case the sample mean would 
usually be used as an estimator and this is not the m.l. estimator based 
on the frequencies used in the goodness-of-fit test because of grouping. 
The effect in this case will be small for any sensible tail grouping; 
Chernoff and Lehmann [1954] have given a numerical example. 

That the effect of estimation may not be thought of as simply putting 
restrictions on the frequencies may be seen more simply another way. 
It is shown in Section 3, that, for known expected values and k classes, 
the distribution of $X^2$ when the frequencies are forced to obey r linear
constraints is that of chi square with $k - r - 1$ degrees of freedom plus
a positive quadratic form depending on the constraints. The effect of 
estimation on single degrees of freedom which, if the cell probabilities 
were known, would be approximately independent standard normal 
deviates will be felt unless their coefficients are orthogonal to the 
columns of the matrix B of (2.2). This condition is necessary whether 
the correct estimators (i.e. B.A.N. estimators based on cell frequencies)
are used or not. It is therefore recommended, because of the above, 
that the correct estimators be used even though they will be more 
troublesome to obtain. They may be found by, if necessary, iterative 
solution of the appropriate (e.g. m.l.) equations with the simpler
estimators as starting values. This is less work than assessing the effects
of alternative estimators by the analysis of this paper. 

We have been speaking above of “true multinomial situations.” 
The testing of the distribution of a continuous variate, given an un- 
grouped sample, may be reduced to this problem if its range is broken up 
into intervals independently of the sample. If the raw data is a grouped
sample, the problem is already in the required form. Chernoff and 




Lehmann [1954] have discussed this case too. It will be much more 
likely here that m.l. estimators based on the sample will be used instead
of the correct estimators and the effect may be considerable for a small
number of (or a poor choice of) classes, e.g., a doubling of the type I
error. The use of the $X^2$ test for this problem has been justly criticized
because of the arbitrariness of the intervals to be used; there is no 
doubt that its use here is forced and not natural. Neyman’s smooth 
test and the Cramér-Kolmogorov-Smirnov tests have been introduced 
to overcome just this weakness. These tests however run into difficulties 
when there are parameters requiring estimation, although they are less 
serious when the parameters are those of scale and location. In support 
of the statement that the theory of $X^2$ tests is basic, it will be shown
elsewhere that the theory and application of the Cramér-Smirnov tests 
in the case of estimation may be treated using the results of this paper. 

A difficulty, not widely recognized, with the class intervals is that 
they will often be conditioned to some extent by the data so that the re- 
duction to the multinomial is no longer true. To take an extreme case, 
and the normal distribution as a particular example, the result that equal 
class probabilities are the most desirable to use may lead to intervals 
being set up with the sample mean and variance so that this is so. In
Section 4 the results of several writers on this problem are given. It is 
shown that the resultant distribution of $X^2$ is chi-square with its classical
number of degrees of freedom provided that, once the intervals are set 
up, the rest of the calculations are correct for the fixed interval case. 
With reference to the extreme example above, this would mean that the 
expected numbers per interval would be calculated using the class fre- 
quency m.l. estimators, not the sample mean and variances, and so 
they would only be approximately constant. This result justifies the 
intuitive procedure in large samples of ignoring the way the class 
intervals were found, assuming the method of their construction is a 
“fair” method like that in the theory. Methods like choosing the 
intervals to minimize or maximize $X^2$ are clearly not covered just as
they are not when there is no estimation. 
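
The extreme case described above, class boundaries chosen to give equal class probabilities under a normal distribution fitted by the sample mean and variance, can be sketched as follows. This is only an illustration: the helper names and the data are hypothetical, and the sketch covers the construction of the intervals only, not the subsequent fitting by the class-frequency m.l. estimators that the text requires.

```python
from statistics import NormalDist, mean, stdev

def equiprobable_boundaries(k, mu, sigma):
    """Interior boundaries g_1 < ... < g_{k-1} cutting N(mu, sigma^2) into k equal-probability classes."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf(l / k) for l in range(1, k)]

def class_counts(sample, boundaries):
    """Number of observations in each of the len(boundaries) + 1 classes."""
    counts = [0] * (len(boundaries) + 1)
    for x in sample:
        counts[sum(1 for g in boundaries if x > g)] += 1
    return counts

# Hypothetical sample; stdev uses the divisor n - 1.
sample = [4.1, 5.3, 6.0, 4.8, 5.9, 5.1, 4.4, 5.6, 6.7, 3.9, 5.0, 5.2]
cuts = equiprobable_boundaries(4, mean(sample), stdev(sample))
counts = class_counts(sample, cuts)
print(cuts, counts)
```

Because the boundaries depend on the sample, the class frequencies are no longer strictly multinomial, which is the difficulty taken up in Section 4.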

In the final section (5) of the paper two general methods of
partitioning $X^2$ with a view to the likely alternatives are suggested for
further study and application. Barton [1956] has considered the effects 
of estimation and grouping on Neyman’s smooth goodness-of-fit test 
which is based on the probability integral transformation as are the 
Cramér-Kolmogorov-Smirnov family of tests. On the null hypothesis, 
and in the absence of estimation and grouping, this transformation 
produces variates distributed uniformly, so that “smooth” deviations 
from the null hypothesis produce smooth changes in the uniform distribution.
Changing the range to $(-1, 1)$, the Legendre polynomials
arise naturally as the detectors of non-uniformity. Barton has general- 
ized Neyman’s method so that estimation and grouping are taken into 
account and it is recommended that his tests be used for general smooth 
alternatives where they will be more powerful than $X^2$. If however
more knowledge of the alternative is possessed, e.g. that the distribution 
is nearly a known distribution, it seems better to use either special tests
(e.g. normal: those for skewness and kurtosis) or an expansion of the
unknown distribution in orthogonal polynomials with the known
density as a weight function. If the data are grouped, this will lead to
special partitions of $X^2$ and this is the suggestion of Lancaster. The
method suffers from a lack of theory of the type provided by Barton 
for Neyman’s method but has the merit that its results have a ready 
interpretation. There is room for more work in this area. In particular, 
the ideas of this paragraph may be more fruitfully applied to the Cramér- 
Smirnov tests as will be shown elsewhere. 
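
As an illustration of how the Legendre polynomials act as detectors of non-uniformity, the sketch below computes the first components of Neyman's smooth statistic in its original form, that is, without estimation or grouping (Barton's generalization is not attempted). The function name and the five transformed values are hypothetical.

```python
import math

def legendre_components(u_values, order=2):
    """Standardized Legendre components for values u in (0, 1); order may be 1, 2 or 3.

    Each u is mapped to z = 2u - 1 on (-1, 1). Under uniformity each component
    has mean zero and unit variance, because E[P_j(z)^2] = 1 / (2j + 1).
    """
    polys = [
        lambda z: z,
        lambda z: (3 * z * z - 1) / 2,
        lambda z: (5 * z ** 3 - 3 * z) / 2,
    ]
    n = len(u_values)
    comps = []
    for j in range(1, order + 1):
        p = polys[j - 1]
        total = sum(math.sqrt(2 * j + 1) * p(2 * u - 1) for u in u_values)
        comps.append(total / math.sqrt(n))
    return comps

# Hypothetical probability-integral-transformed observations.
sample_u = [0.1, 0.3, 0.5, 0.7, 0.9]
components = legendre_components(sample_u)
psi2 = sum(c * c for c in components)   # referred to chi-square with `order` d.f.
print(components, psi2)
```

The first component responds to a shift in location and the second to a change in spread, which is what gives the components their ready interpretation.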

In summary, the theory and application of $X^2$ to true multinomial
situations leaves little to be desired. For continuous distributions the 
approach is less natural although, with a little more care than is usually 
used, it may be made to solve the main problems satisfactorily. The 
largest field of application—contingency tables—has not been treated 
here at all and badly needs a review. 


2. The Multinomial Distribution—Asymptotic Theory


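Let N observations be drawn from a multinomial distribution of k
classes with class probabilities $p_1, p_2, \dots, p_k$ such that the class
frequencies are $n_1, \dots, n_k$, $\sum_l n_l = N$. It will be supposed that
the class probabilities are functions of s parameters, $\theta_1, \theta_2, \dots, \theta_s$,
sufficiently smooth to satisfy the requirements of Cramér [1954, para.
30.3]. Cramér then shows that the maximum likelihood estimators,
$\hat\theta_1, \dots, \hat\theta_s$, or any estimators asymptotically equivalent to them,
satisfy

$$\hat\theta - \theta = \frac{1}{\sqrt N}(B'B)^{-1}B'x + o_p\!\left(\frac{1}{\sqrt N}\right) \tag{2.1}$$

where $\hat\theta$, $\theta$, x are column vectors of components $\hat\theta_i$, $\theta_i$, and
$[n_l - Np_l(\theta)]/\sqrt{Np_l(\theta)}$ respectively, where

$$B = \begin{bmatrix} \dfrac{1}{\sqrt{p_1(\theta)}}\dfrac{\partial p_1}{\partial\theta_1} & \cdots & \dfrac{1}{\sqrt{p_1(\theta)}}\dfrac{\partial p_1}{\partial\theta_s} \\ \vdots & & \vdots \\ \dfrac{1}{\sqrt{p_k(\theta)}}\dfrac{\partial p_k}{\partial\theta_1} & \cdots & \dfrac{1}{\sqrt{p_k(\theta)}}\dfrac{\partial p_k}{\partial\theta_s} \end{bmatrix} \tag{2.2}$$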


and where $o_p(1/\sqrt N)$ is any quantity which, when multiplied by $\sqrt N$,
tends to zero in probability as $N \to \infty$. It will be noted that the
expected value of $(B'B)^{-1}B'x$ is zero so that $E(\hat\theta - \theta) \to 0$ as $N \to \infty$.
The same conditions ensure that the covariance matrix of $\hat\theta - \theta$ is of
order $N^{-1}$. We will discuss later the effect of having estimators whose
bias is $O(N^{-1/2})$ and also the effect of using estimators whose covariance
matrix is $O(1/N)$.

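The next stage of Cramér’s analysis, referred to above, is to show
that the terms in

$$X^2 = \sum_{l=1}^{k}\frac{[n_l - Np_l(\hat\theta)]^2}{Np_l(\hat\theta)} = \sum_{l=1}^{k} y_l^2$$

may be written

$$y_l = \frac{n_l - Np_l(\hat\theta)}{\sqrt{Np_l(\theta)}} + o_p(1),$$

that is

$$y = x - \sqrt N B(\hat\theta - \theta) + o_p(1). \tag{2.3}$$

Now using (2.1), it is seen that

$$y = [I - B(B'B)^{-1}B']x + o_p(1).$$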


Since $X^2 = y'y$, the fact that the asymptotic joint distribution of the
$x_l$ is multivariate normal with mean zero and a covariance matrix
$I - \sqrt{p}\sqrt{p}'$, where $(\sqrt{p})' = [\sqrt{p_1}, \dots, \sqrt{p_k}]$, and $B'\sqrt{p} = 0$,
leads to $X^2$ being distributed asymptotically as a sum of squares of
variables which are normal with zero means and the idempotent
covariance matrix

$$I - \sqrt{p}\sqrt{p}' - B(B'B)^{-1}B'. \tag{2.4}$$

Since the matrix (2.4) has a trace of $k - s - 1$, the well known result,
that $X^2$ is asymptotically distributed as a chi-square variate with
$k - s - 1$ degrees of freedom, is proven.
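
The two properties of the matrix (2.4) used in this argument, idempotency and a trace of k − s − 1, can be checked numerically. The sketch below does so for a trinomial with cell probabilities (θ, 2θ, 1 − 3θ), the example taken up in Section 3, so that k = 3 and s = 1. The function names and the value θ = 0.2 are hypothetical choices made for illustration.

```python
import math

def classical_covariance(p, dp):
    """Matrix (2.4) for a single parameter: I - sqrt(p) sqrt(p)' - B (B'B)^{-1} B'."""
    k = len(p)
    root = [math.sqrt(pl) for pl in p]
    b = [d / r for d, r in zip(dp, root)]      # the single column of B, from (2.2)
    bb = sum(v * v for v in b)                 # B'B
    return [[(1.0 if i == j else 0.0) - root[i] * root[j] - b[i] * b[j] / bb
             for j in range(k)] for i in range(k)]

def matmul(a, b):
    """Plain product of two square matrices held as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

theta = 0.2                                    # hypothetical parameter value
p = [theta, 2 * theta, 1 - 3 * theta]          # cell probabilities
dp = [1.0, 2.0, -3.0]                          # derivatives of p_l with respect to theta
M = classical_covariance(p, dp)
trace = sum(M[i][i] for i in range(3))
M2 = matmul(M, M)
max_gap = max(abs(M2[i][j] - M[i][j]) for i in range(3) for j in range(3))
print(trace, max_gap)
```

Here the trace is 1 = k − s − 1 and the product of M with itself reproduces M up to rounding, so a single degree of freedom remains.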

The simplicity of this result arises from being able to neglect certain 
terms and from the occurrence of the same matrix B in (2.1) and (2.3). 
Taking up the latter point first, suppose that different estimators,
$\theta^*_1, \dots, \theta^*_s$, say, had been used but that we can still have

$$\theta^* - \theta = \frac{1}{\sqrt N}Ax + o_p\!\left(\frac{1}{\sqrt N}\right) \tag{2.5}$$

with $E(Ax) = 0$; the covariance matrix of the $\theta^*$ will still be $O(N^{-1})$.




In this case, (2.3) is replaced by

$$y = x - \sqrt N B(\theta^* - \theta) + o_p(1) \tag{2.6}$$

so that

$$y = (I - BA)x + o_p(1)$$

and $X^2 = y'y$ is distributed as the sum of squares of variables which
are asymptotically normal with zero means but with a covariance
matrix V given by

$$(I - BA)(I - \sqrt{p}\sqrt{p}')(I - A'B')$$

which is equal to

$$I - \sqrt{p}\sqrt{p}' - BA(I - \sqrt{p}\sqrt{p}') - (I - \sqrt{p}\sqrt{p}')A'B' + BA(I - \sqrt{p}\sqrt{p}')A'B'. \tag{2.7}$$


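The latent roots $\lambda_i$ of (2.7) are no longer obvious. They may however
be studied using a device due to Barton [1956]. Defining $\mu_i = 1 - \lambda_i$,
the $\mu_i$ are the latent roots of a matrix which may be written as

$$\sqrt{p}\sqrt{p}' + B\left[A(I - \sqrt{p}\sqrt{p}') - \tfrac12 J^{*-1}B'\right] + \left[(I - \sqrt{p}\sqrt{p}')A' - \tfrac12 BJ^{*-1}\right]B'$$

where we have written the asymptotic covariance matrix of the $\theta^*$

$$J^{*-1} = A(I - \sqrt{p}\sqrt{p}')A' \tag{2.8}$$

in terms of the information matrix for these estimators. Hence the
problem of finding the latent roots of (2.7) is reduced to finding those of

$$\begin{bmatrix} \sqrt{p}' \\ A(I - \sqrt{p}\sqrt{p}') - \tfrac12 J^{*-1}B' \\ B' \end{bmatrix} \begin{bmatrix} \sqrt{p}, & B, & (I - \sqrt{p}\sqrt{p}')A' - \tfrac12 BJ^{*-1} \end{bmatrix}$$

i.e. of the (2s + 1) × (2s + 1) matrix

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & AB - \tfrac12 J^{*-1}J & J^{*-1} - \tfrac12 ABJ^{*-1} - \tfrac12 J^{*-1}B'A' + \tfrac14 J^{*-1}JJ^{*-1} \\ 0 & J & B'A' - \tfrac12 JJ^{*-1} \end{bmatrix}$$

using again that $B'\sqrt{p} = 0$ and writing $B'B = J$, the information
matrix for the multinomial maximum likelihood estimators, $\hat\theta$. Thus
$k - 2s - 1$ of the $\lambda_i$ are unity, one is zero, and the remaining 2s are
the latent roots of

$$\begin{bmatrix} AB - \tfrac12 J^{*-1}J & J^{*-1} - \tfrac12 ABJ^{*-1} - \tfrac12 J^{*-1}B'A' + \tfrac14 J^{*-1}JJ^{*-1} \\ J & B'A' - \tfrac12 JJ^{*-1} \end{bmatrix}. \tag{2.9}$$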


Hence asymptotically

$$X^2 = \chi^2_{k-2s-1} + \sum_{i=1}^{2s}\lambda_i y_i^2$$

where the $y_i$ are standard normal deviates independent of $\chi^2_{k-2s-1}$ and
each other. This result is intimately related to Theorem 1 of Barton
[1956] on the distribution of Neyman’s smooth goodness-of-fit statistic
when the parameters are estimated.

If $A = (B'B)^{-1}B'$ so that $\theta^* = \hat\theta$, $BA = BJ^{*-1}B'$, and $J^* = J$,
(2.9) becomes

$$\begin{bmatrix} \tfrac12 I & \tfrac14 J^{-1} \\ J & \tfrac12 I \end{bmatrix},$$


which may easily be shown (since it is idempotent and has a trace of s) 
to have s zero and s unit latent roots. We thus return to the classical 
result. In the most usual case where the correct (i.e. the maximum 
likelihood estimators derived from the class frequencies) estimators 
are not used, they are replaced by the maximum likelihood estimators 
computed from the ungrouped sample data. In this case the results of 
Chernoff and Lehmann [1954] imply that s of the roots of (2.9) are unity 
and the other s roots are all between zero and one. For example, in the 
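case of a continuous distribution with density function $f(x, \theta_1, \dots, \theta_s)$
and fixed class boundaries $-\infty < x_1 < \dots < x_{k-1} < \infty$ it is easily
seen that

$$A = J^{*-1}\left[\frac{1}{\sqrt{p_l(\theta)}}\int_{x_{l-1}}^{x_l}\frac{\partial f}{\partial\theta_i}\,dx\right] = J^{*-1}B',$$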




Thus $AB = J^{*-1}J$ and to find the latent roots it is better to return
from (2.9) to (2.7) which becomes simply

$$I - \sqrt{p}\sqrt{p}' - BJ^{*-1}B'.$$

The latent roots of this matrix are $k - s - 1$ unities, one zero and the
roots of the determinantal equation

$$|J - (1 - \lambda)J^*| = 0,$$

which lie between zero and one because the $\theta^*$ contain more information
than the $\hat\theta$. Hence

$$X^2 = \chi^2_{k-s-1} + \sum_{i=1}^{s}\lambda_i y_i^2,$$

the result of Chernoff and Lehmann and the analogue of Theorem II of
Barton. A similar result is obtained when a discrete distribution, such 
as a Poisson distribution, is grouped. 
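
For a single parameter the determinantal equation gives λ = 1 − J/J* directly. The sketch below evaluates this root for the mean of a normal distribution with unit variance, for which the ungrouped information per observation is J* = 1, under a coarse and a finer set of fixed class boundaries. The function names and the boundaries are hypothetical choices made for illustration.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def grouped_information(boundaries, theta=0.0):
    """J = sum over classes of (dp_l/dtheta)^2 / p_l for N(theta, 1) with fixed boundaries."""
    cuts = [-math.inf] + list(boundaries) + [math.inf]
    info = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        p = Phi(hi - theta) - Phi(lo - theta)
        dp = phi(lo - theta) - phi(hi - theta)
        info += dp * dp / p
    return info

J_STAR = 1.0                                    # information per ungrouped observation
coarse = [-1.0, 0.0, 1.0]                       # four classes
fine = [-2.0 + 0.5 * i for i in range(9)]       # ten classes
lam_coarse = 1.0 - grouped_information(coarse) / J_STAR
lam_fine = 1.0 - grouped_information(fine) / J_STAR
print(lam_coarse, lam_fine)
```

With four classes the root is about 0.12 and it shrinks as the grouping is refined, so in this instance the extra term added to the chi-square variate is small.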

Thus the distribution of $X^2$ when the correct estimators are not used
is in general not a $\chi^2$ and contains the unknown parameters $\theta$. Nothing
is known of the magnitudes of the 2s latent roots $\lambda_i$ in the most general
case where they must be found by (2.9). When the multinomial has
been obtained by grouping a discrete or continuous distribution and the
sample maximum likelihood estimators are used, s of these roots are
unity and the other s roots will decrease as the grouping is made finer
and finer since then $J \to J^*$, so that finally $X^2 \sim \chi^2_{k-s-1}$. In the general
case one would presume that the roots of (2.9) fall usually into two groups:
s roots near unity and s roots near zero. It follows from (2.15) below
that the sum of the 2s roots is always greater than s.

We see that (2.3) would still be true for estimators which converge
more rapidly to $\theta$. But if $\sqrt N(\theta^* - \theta) = o_p(1)$, y is asymptotically
equivalent to x, so that $X^2 \sim \chi^2_{k-1}$, i.e. estimation of parameters has
no effect at all on the distribution of $X^2$. Proceeding in the other
direction, suppose that the estimators $\theta^*$ are of type (2.5) with the additional
term $bN^{-1/2}$, i.e. that $E(\theta^* - \theta)$ is not $o(N^{-1/2})$ as before but $O(N^{-1/2})$,
and have a covariance matrix of $O(N^{-1})$ as before. Then it follows from
(2.6) that the y still have an asymptotically normal distribution with
covariance matrix (2.7) but with a non-zero mean vector $-(\sqrt{p})Bb$,
where $(\sqrt{p})$ is a diagonal matrix of elements $\sqrt{p_1}, \dots, \sqrt{p_k}$. Thus
$X^2$ tends to a non-central quadratic form. This is the full analogue of
Barton’s Theorem 1, except that he deals with the non-null case.

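We may now discuss the partitioning of $X^2$, i.e. we wish to consider
linear forms in the frequencies

$$L_i = \sum_{l=1}^{k} L_{il}\,\frac{n_l - Np_l(\theta^*)}{\sqrt{Np_l(\theta^*)}} \qquad (i = 1, \dots, k)$$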




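the sum of whose squares adds up exactly to $X^2$. Choosing

$$L_{kl} = \sqrt{p_l(\theta^*)} \qquad (l = 1, \dots, k) \tag{2.10}$$

it is seen that $L_k = 0$ identically. If the other forms are chosen to be
normalized and orthogonal to $L_k$, i.e. so that, if $L_i' = [L_{i1}, \dots, L_{ik}]$,

$$L_i'\sqrt{p} = 0, \qquad L_i'L_j = \delta_{ij}, \tag{2.11}$$

then

$$X^2 = \sum_{i=1}^{k-1} L_i^2. \tag{2.12}$$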
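
The identity (2.12) can be illustrated numerically in the simplest setting, with known cell probabilities and no estimation. The sketch below builds k − 1 coefficient vectors satisfying (2.11) by Gram-Schmidt orthogonalization and checks that the squared forms add up to $X^2$. The function name, the counts and the probabilities are hypothetical.

```python
import math

def orthonormal_forms(p):
    """k - 1 coefficient vectors, orthonormal and orthogonal to sqrt(p), as in (2.11)."""
    k = len(p)
    basis = [[math.sqrt(pl) for pl in p]]       # the vector sqrt(p), of unit length
    for i in range(k):
        v = [1.0 if j == i else 0.0 for j in range(k)]
        for b in basis:                          # remove components along earlier vectors
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-10:                         # skip the one linearly dependent vector
            basis.append([x / norm for x in v])
    return basis[1:]

# Hypothetical trinomial sample with known cell probabilities.
counts = [18, 55, 27]
p = [0.25, 0.5, 0.25]
N = sum(counts)
x = [(n - N * pl) / math.sqrt(N * pl) for n, pl in zip(counts, p)]
forms = [sum(c * xi for c, xi in zip(row, x)) for row in orthonormal_forms(p)]
x2 = sum(xi * xi for xi in x)
print(x2, sum(f * f for f in forms))
```

Any other orthonormal choice satisfying (2.11) gives a different split of the same total, which is what allows the forms to be tailored to particular alternatives.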


The conditions (2.10) mean that the coefficients $L_{kl}$ are functions of $\theta^*$.
However, for the same reason as we may take the denominators of
elements of $X^2$ to be their true values, we may suppose, for asymptotic
results, that the $L_{kl}$ take their true values, i.e. the first equation of (2.11)
may be taken as $\sum_l L_{il}\sqrt{p_l(\theta)} = 0$. Thus the joint distribution of
any subset of the $L_i$ is asymptotically multivariate normal with zero
means and, [assuming that $\theta^*$ is the estimator first envisaged in (2.5)],
with covariances given by

$$\operatorname{cov}(L_i, L_j) = L_i'VL_j,$$

i.e.

$$\operatorname{cov}(L_i, L_j) = \delta_{ij} - L_i'BAL_j - L_i'A'B'L_j + L_i'BJ^{*-1}B'L_j \tag{2.13}$$


after using (2.7) and (2.11). Since $(\sqrt{p_1}, \dots, \sqrt{p_k})$ is a latent
vector of V, the coefficient vectors $L_i$ can be taken, if suitable, as the
latent vectors of V. If such a choice is made $L_i'VL_j = 0$ $(i \ne j)$ and
$L_i'VL_i = \lambda_i$, the latent root of V associated with $L_i$. Hence, in the
general case, $k - 2s - 1$ forms will have unit variances and 2s forms
will have variances $\lambda_1, \dots, \lambda_{2s}$ which are the roots of (2.9). In fact
the last 2s forms can be used “to partition out” the disturbance due to
estimation because, if the sum of their squares is subtracted from $X^2$,
the result is a statistic which is distributed as $\chi^2_{k-2s-1}$. However the
loss of degrees of freedom is s more than it need be.

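Suppose now that the last s forms are defined by the linear forms
used in (2.1), namely by

$$L = [L_{k-s}, \dots, L_{k-1}] = B(B'B)^{-1/2}$$

so that, as required by (2.11), $L'L = I_s$ and $L'\sqrt{p} = 0$. Then

$$L'VL = I_s - J^{1/2}ABJ^{-1/2} - J^{-1/2}B'A'J^{1/2} + J^{1/2}J^{*-1}J^{1/2}$$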




whose trace is given by

$$s - 2\operatorname{trace}(AB) + \operatorname{trace}(JJ^{*-1}).$$

But

$$\operatorname{trace} V = k - [1 + 2\operatorname{trace}(AB) - \operatorname{trace}(JJ^{*-1})] = (k - s - 1) + \operatorname{trace}(L'VL).$$

If then forms $L_1, \dots, L_{k-s-1}$ are constructed so that with the above
definition of forms $L_{k-s}, \dots, L_{k-1}$, (2.11) is satisfied, it is seen that

$$\sum_{i=1}^{k-s-1}\operatorname{var}(L_i) = k - s - 1, \tag{2.15}$$

so that it would appear that $X^2 - \sum_{i=k-s}^{k-1} L_i^2 = L_1^2 + \dots + L_{k-s-1}^2$
is now distributed as $\chi^2_{k-s-1}$, i.e. that the disturbing effect of incorrect
estimation has been partitioned out. For this to be true we must show
that a typical pair of vectors $L_i$, $L_j$ $(i, j = 1, \dots, k - s - 1)$ satisfying
(2.11) and

$$B'L_i = 0, \qquad B'L_j = 0 \tag{2.16}$$

also gives

$$L_i'VL_j = \delta_{ij}. \tag{2.17}$$

But this is seen to be true from the form (2.7) for V.

Since only k — 2s — 1 latent roots of V are necessarily unity, it 
seems strange that the disturbing effect of incorrect estimation on $X^2$
can be partitioned out with only s linear forms. However, this choice of
the last s degrees of freedom served as the basis for Fisher’s [1928] most
general account of the effect of estimation on the distribution of $X^2$.
It can be shown that the subtraction of these terms is equivalent to
calculating $X^2$ with the estimators which result from one cycle of iteration
to the maximum likelihood equations, using the estimators $\theta^*$ as the
starting values. This works because in large samples one iteration 
should be enough. 

This completes the asymptotic investigation of the effect of
estimation on the partitioning of $X^2$. If the linear forms of interest satisfy
(2.11) and (2.16), they may be taken, on the null hypothesis, as
independent standard normal deviates. If this is not so, (2.13) must be
used to compute their variances and they may not be independent.
When the correct estimators are used so that $\theta^* = \hat\theta$, (2.13) reduces to

$$\operatorname{Cov}(L_i, L_j) = \delta_{ij} - L_i'B(B'B)^{-1}B'L_j, \tag{2.18}$$


so that it is still necessary to insist on (2.16) being satisfied if the linear 
forms are to have unit variance. 




The theory of this section is adequate to deal with true multinomial 
situations, cases when a discrete distribution has been grouped (for 
example, the Poisson distribution), and the case of a continuous dis- 
tribution which has been grouped by using intervals that are fixed for 
all samples. Chernoff and Lehmann [1954] give two worked examples 
showing the effect of using sample rather than frequency maximum 
likelihood estimators when testing normal and Poisson distributions. 
In the first case the effect may be large and in the second case it will 
rarely be appreciable. But, in fact, one should never use such estimators
because it is not very troublesome to find the correct ones, using the
sample maximum likelihood estimators as starting values in the iterative
solution of the frequency maximum likelihood equations. Cochran
[1954, 1955] has considered the variances of linear forms in the class
frequencies. His formulae are particular cases of those given here. It
should be noted that his “variance chi-square” is not of the type
considered here and so the above remarks on estimation do not apply to it.
However in his ordinary goodness-of-fit $X^2$, it does appear as if he has
used the sample estimators though, in his cases, the perturbation would
be small.
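
The iterative solution recommended above can be sketched for a Poisson distribution whose upper tail has been pooled into one class. The sketch uses Fisher scoring on the class-frequency likelihood, starting from a value standing in for the ungrouped sample mean. The scoring scheme is one convenient choice rather than a method prescribed by the text, and the function names, counts and starting value are hypothetical.

```python
import math

def cell_probs(theta, m):
    """Poisson probabilities for the classes 0, 1, ..., m - 1 and the pooled tail 'm or more'."""
    probs = [math.exp(-theta) * theta ** j / math.factorial(j) for j in range(m)]
    return probs + [1.0 - sum(probs)]

def cell_derivs(theta, m):
    """Derivatives of the same class probabilities with respect to theta."""
    probs = [math.exp(-theta) * theta ** j / math.factorial(j) for j in range(m)]
    derivs = [(probs[j - 1] if j > 0 else 0.0) - probs[j] for j in range(m)]
    return derivs + [probs[m - 1]]               # derivative of the pooled tail

def frequency_mle(counts, start, steps=50):
    """Fisher scoring for the likelihood based on the class frequencies."""
    m = len(counts) - 1
    n_total = sum(counts)
    theta = start
    for _ in range(steps):
        p = cell_probs(theta, m)
        dp = cell_derivs(theta, m)
        score = sum(n * d / q for n, d, q in zip(counts, dp, p))
        info = n_total * sum(d * d / q for d, q in zip(dp, p))
        theta += score / info
    return theta

# Hypothetical frequencies for the classes 0, 1, 2 and '3 or more'.
counts = [30, 36, 22, 12]
theta_hat = frequency_mle(counts, start=1.2)     # 1.2 stands in for the sample mean
print(theta_hat)
```

Starting from a sensible value only a few cycles are needed in this example, which bears out the remark that the correct estimators are not very troublesome to find.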


3. Multinomial Distributions—More Exact Theory


The above results are all asymptotic. For finite N, the distribution
of $X^2$ or any part of it is discrete and depends on the unknown
parameters even if maximum likelihood estimation is used. The latter
difficulty may be overcome if $\theta_1, \dots, \theta_s$ admit sufficient statistics.
Generalizing a result of Gani [1955], we may suppose that, for $s \le r < k - 1$,
the class probabilities have the form,

$$p_l(\theta) = \frac{a_l\prod_{t=1}^{r}[\alpha_t(\theta)]^{K_{tl}}}{\sum_{m=1}^{k} a_m\prod_{t=1}^{r}[\alpha_t(\theta)]^{K_{tm}}} \tag{3.1}$$

where it may, without loss of generality, be assumed $\sum_{l=1}^{k} a_l = 1$.*
Then the joint distribution of $n_1, \dots, n_k$ given $K_1 = \sum_l K_{1l}n_l, \dots,$
$K_r = \sum_l K_{rl}n_l$ is the same as if the $n_1, \dots, n_k$ were drawn, under the
same restrictions, from the multinomial population with class
probabilities $a_1, \dots, a_k$. Since the $K_{1l}, K_{2l}, \dots, K_{rl}$ $(l = 1, \dots, k)$ are
independent of $\theta_1, \dots, \theta_s$, the conditional distribution of any function
(e.g. $X^2$) of $n_1, \dots, n_k$ for given values of $K_1, \dots, K_r$ is independent


*A more elaborate discussion could be given using recent results of Halmos and Savage [1949].
(For an elementary exposition, see Watson [1957a] but this would not be in keeping with our present 
point of view.) 




of $\theta_1, \dots, \theta_s$. The restrictions may so drastically limit the number of
possible samples that it may be easily possible to compute, for each such
sample, the value of the statistic and the probability in the conditional
distribution and so to perform an “exact” test in the same way as is
done in the well known case of 2 × 2 tables. Such tests are “similar” in
the Neyman-Pearson sense since the Type I error rate will not depend 
on the unknown parameters. Unless some modification, such as random- 
ization, is used, discreteness will prevent the attainment of a preassigned 
Type I error rate. 

If the number of possible values of the statistic is too large for an 
enumeration even of the “more significant samples,” it seems worth- 
while trying to approximate the conditional distribution of the statistic 
for N large. It seems to be assumed in the literature that the
conditional distribution of

$$X^2 = \sum_{l=1}^{k}\frac{[n_l - Np_l(\hat\theta)]^2}{Np_l(\hat\theta)} = X^2_{ML}, \text{ say}, \tag{3.2}$$

given $K_1, \dots, K_r$, will be asymptotically $\chi^2_{k-s-1}$ but no previous proof
of this is known to the writer. Writing $K' = [K_1, \dots, K_r]$, the same
applies to an alternative statistic

$$X_R^2 = \sum_{l=1}^{k}\frac{[n_l - E(n_l \mid K)]^2}{E(n_l \mid K)} \tag{3.3}$$

mentioned by Rao and Chakravarti [1956]. The form of the
conditional distribution of the $n_1, \dots, n_k$ given $K_1, \dots, K_r$, suggests
the much simpler statistic

$$X_g^2 = \sum_{l=1}^{k}\frac{(n_l - Na_l)^2}{Na_l}. \tag{3.4}$$

The statistic (3.4) appears superior because it does not require the
solution of the maximum likelihood equations, as does (3.2), and does
not require the evaluation of the $E(n_l \mid K)$, as does (3.3).

The evaluation of the $E(n_l \mid K)$, or more generally the expectation
of any function $\phi$ of the $n_l$’s may be carried out by a generalization of
the method of Rao and Chakravarti. From (3.1), the probability of
observing K, $P_\theta(K)$ say, is given by

$$\frac{\prod_{t=1}^{r}[\alpha_t(\theta)]^{K_t}}{\left\{\sum_{l=1}^{k} a_l\prod_{t=1}^{r}[\alpha_t(\theta)]^{K_{tl}}\right\}^{N}}\,P_1(K) = P_\theta(K) \tag{3.5}$$

where

$$P_1(K) = \sum\frac{N!}{n_1!\cdots n_k!}\,a_1^{n_1}\cdots a_k^{n_k}, \tag{3.6}$$

the summation being over all samples which give simultaneously
$\sum_l K_{1l}n_l = K_1, \dots, \sum_l K_{rl}n_l = K_r$. Writing $E(\phi \mid K)$ for the
conditional mean of $\phi$ given K,

$$\sum_K E(\phi \mid K)P_\theta(K) = E(\phi), \tag{3.7}$$
where the summation is over all possible values of K. Combining (3.7) 
with (3.5) and rearranging, we find 

Since the $K_{1l}, \dots, K_{rl}$ may be taken as integers, it follows that


It will, however, often be difficult to evaluate (3.9). 

The asymptotic distribution of (3.4) may be determined by methods
similar to those used in Section 2. It may be assumed that for large N,
$n_1, \dots, n_k$, $\sum_l K_{1l}n_l, \dots, \sum_l K_{rl}n_l$ have a multivariate normal
distribution with means and covariances determined from the
unconditional multinomial distribution $(a_1 + \dots + a_k)^N$. It is also
necessary to consider a sequence of values for K which converge to
E(K) at the proper rate. Since

$$E(K_i) = N\sum_l K_{il}a_l,$$
$$\operatorname{cov}(K_i, K_j) = N\Big[\sum_l K_{il}K_{jl}a_l - \Big(\sum_l K_{il}a_l\Big)\Big(\sum_l K_{jl}a_l\Big)\Big] = NV_{ij}, \text{ say}, \tag{3.10}$$

we will assume that the sequence of values is such that

$$N^{-1}[K - E(K)]'V^{-1}[K - E(K)] \tag{3.11}$$

converges to a non-zero finite limit. Unconditionally, the quantity
(3.11) is asymptotically a $\chi^2_r$ variable. On this basis, it is easy to show
that, asymptotically,

$$X_g^2 \sim \chi^2_{k-r-1} + N^{-1}[K - E(K)]'V^{-1}[K - E(K)]. \tag{3.12}$$

This suggests that in practice the statistic

$$X_G^2 = X_g^2 - N^{-1}[K - E(K)]'V^{-1}[K - E(K)] \tag{3.13}$$

should be referred to the $\chi^2$ tables with $k - r - 1$ degrees of freedom.
This result is of some general interest. Roy and Mitra [1955] mention
that Mitra has found a case where (3.4) is not distributed as a $\chi^2$.




Our result is a possible answer to their problem.
The result (3.12) may be rewritten, by analogy with the identity in
regression analysis,

Total S.S. = Error S.S. + Regression S.S.

To find a vector $\beta_l$, so that $n_l - Na_l - \beta_l'[K - E(K)]$ is uncorrelated
with $K - E(K)$, we must solve the equations

$$E\{[n_l - Na_l][K - E(K)]'\} = \beta_l'VN. \tag{3.14}$$

But

$$E\{[n_l - Na_l][K_i - E(K_i)]\} = Na_l\Big(K_{il} - \sum_m K_{im}a_m\Big)$$

so that

$$\beta_l'V = a_l\Big[K_{1l} - \sum_m K_{1m}a_m,\ \dots,\ K_{rl} - \sum_m K_{rm}a_m\Big]. \tag{3.15}$$

Now

$$\sum_l\frac{\{n_l - Na_l - \beta_l'[K - E(K)]\}^2}{Na_l} = \sum_l\frac{(n_l - Na_l)^2}{Na_l} - 2\sum_l\frac{(n_l - Na_l)\,\beta_l'[K - E(K)]}{Na_l} + \sum_l\frac{\{\beta_l'[K - E(K)]\}^2}{Na_l}$$
$$= X_g^2 - N^{-1}[K - E(K)]'V^{-1}[K - E(K)] = X_G^2.$$

Thus we have shown that

$$E(n_l \mid K) \sim Na_l + \beta_l'[K - E(K)], \tag{3.16}$$

and that

$$X_G^2 \to \sum_l\frac{[n_l - E(n_l \mid K)]^2}{Na_l}. \tag{3.17}$$

However the last term of (3.16) is $O(\sqrt N)$ so that if (3.16) is used in
(3.3), the $E(n_l \mid K)$ in the denominator may be replaced asymptotically
by $Na_l$ so that (3.3), i.e. $X_R^2$, becomes equivalent to (3.17), i.e. $X_G^2$,
and so is also asymptotically $\chi^2_{k-r-1}$. It is worth noting that this result
implies the usual $X^2$ statistic for two-way tables is asymptotically $\chi^2$
when the marginal totals are kept fixed.

It remains to verify that $X^2_{ML}$ is asymptotically $\chi^2_{k-s-1}$ in the




conditional distribution with K fixed. However, a stronger result is
true. Returning to the beginning of Section 2, it is seen that $X^2_{ML} = y'y$
where the elements of the vector y given by

$$y = [I - B(B'B)^{-1}B']x + o_p(1)$$

are asymptotically normal and have an idempotent covariance matrix
and zero means. Moreover the asymptotic joint distribution of y and
$\sqrt N(\hat\theta - \theta)$ is multivariate normal since, by (2.1),

$$\sqrt N(\hat\theta - \theta) = (B'B)^{-1}B'x + o_p(1).$$

The asymptotic distribution of $X^2_{ML}$ given $\hat\theta$ must be computed using
the distribution of y for fixed $\hat\theta$. But y and $\hat\theta$ are asymptotically
independent since

$$E[y\sqrt N(\hat\theta - \theta)'] = [I - B(B'B)^{-1}B'][I - \sqrt{p}\sqrt{p}']B(B'B)^{-1} = 0,$$

since $\sqrt{p}'B = 0$. Hence $X^2_{ML}$ is unaffected by fixing $\hat\theta$, and so is still
$\chi^2_{k-s-1}$. Here it is not assumed that the $p_l(\theta)$ are defined as in (3.1);
the result follows from the fact that maximum likelihood estimators
have the sufficiency property in large samples.

Thus if r = s and $N \to \infty$, $X^2_{ML}$, $X_R^2$ and $X_G^2$ are equivalent and are
distributed as $\chi^2_{k-s-1}$. They may however differ in small samples and
one may be better approximated by $\chi^2_{k-s-1}$ than the others. To
investigate this, we consider one of Gani’s examples:

$$p_1 = \theta, \qquad p_2 = 2\theta, \qquad p_3 = 1 - 3\theta.$$

This has the form (3.1) with s = r = 1, $a_1 = \tfrac14$, $a_2 = \tfrac12$, $a_3 = \tfrac14$, $K_{11} = 1$,
$K_{12} = 1$, $K_{13} = 0$. Thus $K_1 = K = n_1 + n_2$, $\alpha(\theta) = \theta(1 - 3\theta)^{-1}$.
Suppose that we wish to make an $X^2$ goodness-of-fit test of the sample
$n_1 = 2$, $n_2 = 9$, $n_3 = 9$ so that N = 20, K = 11. In the conditional
distribution the possible samples are (0, 11, 9), (1, 10, 9), ..., (11, 0, 9)
and the associated values of $X_G^2$, $X^2_{ML}$, and P, the probability of
obtaining the sample from the conditional distribution, are given in Table 1.
It is easily verified that here

$$E(n_1 \mid K) = Np_1(\hat\theta) = \frac{K}{3}, \qquad E(n_2 \mid K) = Np_2(\hat\theta) = \frac{2K}{3}$$

so that $X_R^2 = X^2_{ML}$.




TABLE 1
COMPARISON OF THREE $X^2$ STATISTICS IN SMALL SAMPLES

 n_1   n_2   n_3       P        X_G^2    X_ML^2    X_L^2
  0    11     9    0.011556     4.033     5.500     8.920
  1    10     9    0.063636     2.133     2.909     3.605
  2     9     9    0.158943     0.833     1.136     1.262
  3     8     9    0.238396     0.133     0.182     0.188
  4     7     9    0.238433     0.033     0.046     0.046
  5     6     9    0.166888     0.533     0.727     0.693
  6     5     9    0.083444     1.633     2.227     2.079
  7     4     9    0.029797     2.333     4.546     4.204
  8     3     9    0.007450     5.633     7.682     7.119
  9     2     9    0.001242     8.533    11.637    10.966
 10     1     9    0.000259    12.033    16.409    16.081
 11     0     9    0.000006    16.133    22.000    24.169


Thus the chance of $X_G^2$ exceeding the tabular 5% (1%) point of $\chi^2_1$,
3.841 (6.635), is 0.0205 (0.0015) whereas for $X^2_{ML}$ the corresponding
figures are 0.0504 and 0.00895. So the $X_G^2$ statistic is inferior to the
$X^2_{ML}$ statistic. For comparison the likelihood ratio statistic

$$X_L^2 = 2\sum_l n_l\log_e\frac{n_l}{Np_l(\hat\theta)}$$

has been computed and is seen to be very similar to $X^2_{ML}$.
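
The conditional enumeration behind Table 1 can be re-run directly. The sketch below assumes, as the form (3.1) implies here, that $n_1$ given K is binomial with index K and probability $a_1/(a_1 + a_2) = 1/3$, and computes the corrected statistic (3.13), the maximum likelihood statistic (3.2) and the likelihood ratio statistic for each possible sample. Variable and function names are hypothetical; the probabilities are computed exactly, so later decimal places of P and of the tail areas need not agree with the printed figures.

```python
import math

N, K = 20, 11                        # sample size and the fixed value of K = n1 + n2
a = [0.25, 0.5, 0.25]                # the probabilities a_l in the form (3.1)
pK = a[0] + a[1]                     # E(K) / N, from (3.10)
V = pK * (1.0 - pK)                  # variance of K per observation, from (3.10)
correction = (K - N * pK) ** 2 / (N * V)   # the term subtracted in (3.13)

def table_row(n1):
    """Conditional probability and the three statistics for the sample (n1, K - n1, N - K)."""
    n = [n1, K - n1, N - K]
    prob = math.comb(K, n1) * (1 / 3) ** n1 * (2 / 3) ** (K - n1)
    fitted = [K / 3, 2 * K / 3, N - K]       # N p_l(theta-hat), equal to E(n_l | K) here
    x_ml = sum((o - e) ** 2 / e for o, e in zip(n, fitted))
    x_g = sum((o - N * al) ** 2 / (N * al) for o, al in zip(n, a)) - correction
    x_lr = 2.0 * sum(o * math.log(o / e) for o, e in zip(n, fitted) if o > 0)
    return prob, x_g, x_ml, x_lr

rows = [table_row(n1) for n1 in range(K + 1)]
tail_g = sum(p for p, x_g, _, _ in rows if x_g > 3.841)     # chance of exceeding the 5% point
tail_ml = sum(p for p, _, x_ml, _ in rows if x_ml > 3.841)
print(round(tail_g, 4), round(tail_ml, 4))
```

The same enumeration, with 6.635 in place of 3.841, gives the 1% comparison, and the exact test of the observed sample (2, 9, 9) follows by summing the probabilities of the samples at least as extreme.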

Finally we mention approximations to exact unconditional distributions.
Hoel [1938] first found an asymptotic expansion for the distribution of
$X^2$. Darwin [1958] has reconsidered the problem and tried to find an
asymptotic expansion with discontinuous factors. Good [1957] has given a
general method and a number of results in this difficult field. A much
simpler task is to find an asymptotic expansion for the distribution of a
linear function of multinomial frequencies. Cochran [1955] has given a
fully worked numerical trinomial example. From his results it is clear that
the normal approximation is surprisingly good for the two-tailed test and
that its inadequacy comes more from the discontinuities of the true
distribution than from its failure to follow the overall pattern of the
true distribution. Corrections could easily be made for skewness and
kurtosis by using a series in the first four Hermite polynomials. A proper
treatment requires a study of the relation

$$\mathrm{Prob}\Bigl(\sum_i a_i n_i = A\Bigr) = \frac{1}{2\pi i}\oint_C \lambda^{-A-1}\Bigl(\sum_i p_i \lambda^{a_i}\Bigr)^{N} d\lambda,$$


where the $a_i$ and $A$ are all integers and $C$ is a closed curve about the
origin and on which $|\lambda| > 1$.


4. Continuous Distributions 


Given a sample $x_1, \ldots, x_N$ from a continuous distribution with
density $f(x, \theta_1, \ldots, \theta_s) = f(x, \theta)$, the parameters
$\theta_1, \ldots, \theta_s$ being unknown, an $X^2$ goodness-of-fit test may
be made in several ways. The range of $x$ may be split up into $k$ segments
by $g_0 = -\infty < g_1 < \cdots < g_{k-1} < g_k = +\infty$ and the number of
$x_\alpha$ in each class found to be $n_1, n_2, \ldots, n_k$, say. Estimators
$\theta_1^*, \ldots, \theta_s^*$ for $\theta_1, \ldots, \theta_s$ may be found
in some way and

$$X^2 = \sum_{i=1}^{k} \frac{[n_i - N p_i(\theta^*)]^2}{N p_i(\theta^*)} \qquad (4.1)$$

computed. If the class boundaries $g_1, \ldots, g_{k-1}$ are fixed without
reference to the sample, the class frequencies are multinomial with class
probabilities

$$p_i(\theta) = \int_{g_{i-1}}^{g_i} f(x, \theta)\,dx \qquad (4.2)$$

and the theory of Section 2 may be applied. In practice the class
intervals are to some extent the result of an examination of the data,
though it is perhaps not often that they are strictly defined by it, a
fact that is usually ignored. It may even be possible to define variable
intervals so that the distribution of $X^2$, using the sample maximum
likelihood estimators, is asymptotically free of the unknown $\theta$; it is
intuitively evident that this is so when the parameters are those of scale
and location. This question has been examined by A. R. Roy [1956] and
G. S. Watson [1957b, 1958].* As Roy's treatment is both more general
and more rigorous, it will be sketched below.
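For fixed class boundaries, the computation of the class probabilities and of $X^2$ is routine. The following is a minimal sketch, assuming a normal density with parameters (mu, sigma) purely for illustration; the function names are hypothetical and the estimates passed in may come from any method.

```python
from math import erf, sqrt, inf

def normal_cdf(x, mu, sigma):
    # erf accepts +/-inf, so the open-ended end classes need no special case
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def class_probabilities(boundaries, mu, sigma):
    """Class probabilities p_i(theta) for interior boundaries g_1 < ... < g_{k-1}."""
    g = [-inf] + list(boundaries) + [inf]
    return [normal_cdf(g[i + 1], mu, sigma) - normal_cdf(g[i], mu, sigma)
            for i in range(len(g) - 1)]

def class_counts(sample, boundaries):
    """Frequencies n_i, with class i taken as the interval (g_{i-1}, g_i]."""
    counts = [0] * (len(boundaries) + 1)
    for x in sample:
        counts[sum(1 for b in boundaries if x > b)] += 1
    return counts

def chi_square(counts, probs):
    """X^2 = sum over classes of (n_i - N p_i)^2 / (N p_i)."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))

sample = [-2.0, -0.5, 0.5, 3.0, 0.1, -0.2, 1.4, -1.1]
bounds = [-1.0, 0.0, 1.0]
print(chi_square(class_counts(sample, bounds), class_probabilities(bounds, 0.0, 1.0)))
```

The boundaries here are fixed in advance, which is the case in which the class frequencies are multinomial.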
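We will use the notation that, for a function $h(\theta)$,

$$\frac{\partial h(\theta)}{\partial\theta} = \left[\frac{\partial h(\theta)}{\partial\theta_1}, \ldots, \frac{\partial h(\theta)}{\partial\theta_s}\right]'.$$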


The result of Cramér [1954, ch. 32] for regular maximum likelihood 


*M. C. K. Tweedie has also sketched a general theory in the Discussion to Watson [1958] which
contains a number of comments on this material.




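estimators may then be written, analogously with (2.1),

$$\hat\theta - \theta = \frac{1}{N}\,J^{-1}\sum_{\alpha=1}^{N}\frac{\partial \log f(x_\alpha, \theta)}{\partial\theta} + o_p(N^{-1/2}), \qquad (4.3)$$

where $J$ is the information matrix with $(i, j)$ element

$$J_{ij} = -E\left[\frac{\partial^2 \log f(x, \theta)}{\partial\theta_i\,\partial\theta_j}\right]. \qquad (4.4)$$

It is well known that

$$E\left[\frac{\partial \log f(x, \theta)}{\partial\theta}\right] = 0, \qquad (4.5)$$

where 0 is a zero vector. Roy considers, analogously with (2.5), the
class of estimators defined by

$$\theta^* - \theta = \frac{1}{N}\sum_{\alpha=1}^{N} h(x_\alpha) + o_p(N^{-1/2}), \qquad (4.6)$$

where $h(x)$ is a vector valued function of $x$ with $s$ components such that

$$E[h(x)] = 0 \qquad (4.6a)$$

and such that

$$E[h(x)\,h(x)'] = L, \qquad (4.7)$$

the $s \times s$ covariance matrix of the components of $h(x)$. Clearly (4.6)
reduces to (4.3) by putting

$$h(x) = J^{-1}\,\frac{\partial \log f(x, \theta)}{\partial\theta} \qquad (4.8)$$

and then

$$L = J^{-1}. \qquad (4.9)$$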
As in Section 2, the investigation consists of showing that the

$$y_i = \frac{n_i - N p_i(\theta^*)}{\sqrt{N p_i(\theta^*)}} \qquad (4.10)$$

have a joint multivariate normal distribution with zero means and
finding their covariance matrix. As before the $p_i(\theta^*)$ in the denominator
of (4.10) may be replaced by $p_i(\theta)$ and the $p_i(\theta^*)$ in the numerator
taken to be

$$p_i(\theta) + (\theta^* - \theta)'\,\frac{\partial p_i(\theta)}{\partial\theta}, \qquad (4.11)$$




where, by (4.2), with $g_i$ replaced by $g_i(\theta)$,

$$\frac{\partial p_i(\theta)}{\partial\theta} = \int_{g_{i-1}(\theta)}^{g_i(\theta)} \frac{\partial f(x, \theta)}{\partial\theta}\,dx + \left[ f(g_i(\theta), \theta)\,\frac{\partial g_i(\theta)}{\partial\theta} - f(g_{i-1}(\theta), \theta)\,\frac{\partial g_{i-1}(\theta)}{\partial\theta} \right] = u_i(\theta) + v_i(\theta), \text{ say.} \qquad (4.12)$$

The $n_i$ are the numbers of $x_\alpha$'s in $(g_{i-1}(\theta^*), g_i(\theta^*)]$ and are no longer
multinomial, though their dominant part, the number of $x_\alpha$'s in
$(g_{i-1}(\theta), g_i(\theta)]$, $m_i$ say, clearly is. The crux of Roy's method is his
proof that

$$n_i - m_i = N(\theta^* - \theta)'\,v_i(\theta) + o_p(\sqrt{N}). \qquad (4.13)$$

Considering the case of $\theta$ scalar ($s = 1$) to see the plausibility of this
result, it would be reasonable to suppose that the number of $x_\alpha$'s
between $g_i(\theta)$ and $g_i(\theta^*)$ is approximately binomial with mean

$$N\,\bigl|(\theta^* - \theta)\,g_i'(\theta)\bigr|\,f[g_i(\theta), \theta].$$

The result would then follow from considering the normal approximation
to the binomial distribution wherein the probability of success is
$O(N^{-1/2})$. This is the only point in the argument where it seems really
necessary to assume that the random variable $x$ is a scalar.

The rest of the discussion is similar to that of Section 2. The $y_i$ of
(4.10) is simplified by using (4.13), (4.11), (4.10) and then $\theta^* - \theta$ is
eliminated by (4.6). The only additional task is the proof of asymptotic
normality. This is achieved by expressing the $m_i$ as the sums of $N$
indicator variables so that the column of $y_i$'s is finally expressed as the
sum of $N$ independent vectors and hence a multivariate form of the
central limit theorem gives the result. Writing


| p,(8) 
p=| |, w= de, 
|p. (8) 




Vv = [w,(0), . 0], = 
=P pr’ -Vw-wut+w Du, 

the final result is that $X^2$ is asymptotically distributed as $\sum_i \lambda_i z_i^2$, where
the $\lambda_i$ are the latent roots of $P^{-1/2}\,\Sigma\,P^{-1/2}$. The discussion of the
roots of $P^{-1/2}\,\Sigma\,P^{-1/2}$ is the same as that of (2.7). If the sample
maximum likelihood estimators are used, so that (4.8) and (4.9) are
satisfied, the $s$ non-negative roots are between 0 and 1. They are
usually functions of $\theta$, but if the parameters are those of scale and
location, it is easily shown that this is not so when the functions $g_i(\theta)$
are chosen such that the $p_i(\theta)$ are constant.

This however does not represent the solution to the practical problem,
which may be stated better as follows: what is the effect on the
distribution of $X^2$, calculated by using maximum likelihood estimation
with the class frequencies, when the class boundaries are to some extent
conditioned by the sample? If we can show that $X^2$, so calculated, is
asymptotically $\chi^2_{k-s-1}$ then the intuitive procedure (i.e. ignoring the
variability of the class boundaries) is proved to be valid.

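For this purpose, we may take the formulation of Roy and use the
class boundaries $g_i(\theta^*)$. However with the class frequencies so obtained,
$n_1, \ldots, n_k$, we would, in the intuitive procedure, carry on as follows.
Define

$$p_i(\theta, \theta^*) = \int_{g_{i-1}(\theta^*)}^{g_i(\theta^*)} f(x, \theta)\,dx, \qquad (4.14)$$

and then solve the equations

$$\sum_{i=1}^{k} \frac{n_i}{p_i(\theta, \theta^*)}\,\frac{\partial p_i(\theta, \theta^*)}{\partial\theta} = 0 \qquad (4.15)$$

for $\theta$, obtaining $\theta = \theta^\dagger$, say. Finally

$$X'^2 = \sum_{i=1}^{k} \frac{[n_i - N p_i(\theta^\dagger, \theta^*)]^2}{N p_i(\theta^\dagger, \theta^*)} \qquad (4.16)$$

could be computed and treated as $\chi^2_{k-s-1}$. The asymptotic distribution
of (4.16) can be easily found by a trivial variation of Roy's discussion.
In the present case, (4.13) remains true since only $\theta^*$ is relevant here.
However (4.11) must be replaced by

$$p_i(\theta^\dagger, \theta^*) = p_i(\theta, \theta) + (\theta^\dagger - \theta)'\,\frac{\partial p_i(\theta, \theta^*)}{\partial\theta} + (\theta^* - \theta)'\,\frac{\partial p_i(\theta, \theta^*)}{\partial\theta^*}, \qquad (4.17)$$

the derivatives being evaluated at $\theta^* = \theta$.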




By consideration of (4.14) and by comparison with (4.12), (4.17) becomes

$$p_i(\theta^\dagger, \theta^*) = p_i(\theta, \theta) + (\theta^\dagger - \theta)'\,u_i + (\theta^* - \theta)'\,v_i \qquad (4.18)$$

so that finally

$$\frac{n_i - N p_i(\theta^\dagger, \theta^*)}{\sqrt{N p_i(\theta^\dagger, \theta^*)}} = \frac{m_i - N p_i(\theta, \theta) - N(\theta^\dagger - \theta)'\,u_i}{\sqrt{N p_i(\theta, \theta)}} + o_p(1).$$

But this is precisely the expression which would arise if one began with
the class boundaries $g_i(\theta)$, found the $m_i$, then forgot the true value of
$\theta$ and proceeded to estimate $\theta$ and calculate the $X^2$ statistic in the correct
way for fixed class intervals. So the intuitive method leads to the
result that

$$X'^2 \sim \chi^2_{k-s-1}.$$

This last stage of the proof can, of course, be carried out formally by
the methods given above.
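The intuitive procedure can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes an exponential density $\theta e^{-\theta x}$ (not an example from the text), forms boundaries giving equal estimated class probabilities from the ungrouped estimator, and then maximises the grouped likelihood by a golden-section search, which assumes the grouped likelihood is unimodal on the search interval. All function names are hypothetical.

```python
import random
from math import exp, log

def boundaries_from_estimate(theta_star, k):
    """Interior boundaries g_i(theta*) giving equal estimated class probabilities 1/k."""
    return [-log(1.0 - i / k) / theta_star for i in range(1, k)]

def class_probs(theta, bounds):
    """Class probabilities under rate theta for the given (data-dependent) boundaries."""
    cdf = [0.0] + [1.0 - exp(-theta * g) for g in bounds] + [1.0]
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]

def grouped_log_lik(theta, counts, bounds):
    return sum(n * log(p) for n, p in zip(counts, class_probs(theta, bounds)) if n > 0)

def grouped_mle(counts, bounds, lo, hi, iters=200):
    """Maximise the grouped likelihood on [lo, hi] by golden-section search."""
    r = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - r * (b - a), a + r * (b - a)
        if grouped_log_lik(c, counts, bounds) > grouped_log_lik(d, counts, bounds):
            b = d          # maximum lies in [a, d]
        else:
            a = c          # maximum lies in [c, b]
    return (a + b) / 2

def intuitive_x2(sample, k):
    n = len(sample)
    theta_star = n / sum(sample)                 # ungrouped ML estimator of the rate
    bounds = boundaries_from_estimate(theta_star, k)
    counts = [0] * k
    for x in sample:
        counts[sum(1 for g in bounds if x > g)] += 1
    theta_dagger = grouped_mle(counts, bounds, theta_star / 5, theta_star * 5)
    probs = class_probs(theta_dagger, bounds)
    x2 = sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))
    return theta_star, theta_dagger, x2

random.seed(1)
sample = [random.expovariate(2.0) for _ in range(2000)]
theta_star, theta_dagger, x2 = intuitive_x2(sample, k=8)
print(theta_star, theta_dagger, x2)   # x2 would be referred to chi-square with k - s - 1 = 6 d.f.
```

Maximising the grouped likelihood is equivalent to solving the likelihood equations for the class frequencies, so the second estimate plays the role of the estimator computed from the grouped data with the boundaries treated as fixed.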

The generality of the above argument seems to mean that in practice
we may form our class intervals with full knowledge of the data and then
proceed in the classical manner as though these intervals were known a
priori. This seems intuitively plausible since any fixed set of class
intervals leads to the same asymptotic distribution in the null case,
though in the non-null case the powers will be different, and in the case
of small samples the approximation to the $\chi^2$-distribution will be very
different.

There is therefore no need to go beyond the theory given in Section 2;
in particular, the partitioning of $X^2$ requires no further discussion.


5. Some Metrical Aspects of $X^2$ Tests


In the preceding sections, only distributions on the null hypothesis
have been considered. $X^2$ is not designed for specific alternatives,
although partitions of $X^2$ (or non-independent parts of $X^2$) are often
used to test particular deviations from the null hypothesis. The
asymptotic distribution of $X^2$ may be considered (see Cochran [1952])
for general alternatives whose class probabilities differ from the null
probabilities by terms of order $N^{-1/2}$. This result could be sharpened
to take account of more specific alternatives and of estimation
difficulties, but it would then resemble very closely the discussion of
Neyman's smooth goodness-of-fit test given, in a fine series of papers,
by Barton [1953a, 1953b, 1955, 1956]. In view of the intimate relationship
of Barton's results with those above, his work will be described in
some detail. It will be possible, using the final result of Section 4, to
complete Barton's Theorem III.




The family of distributions envisaged by Neyman and Barton is 
defined by 


K | 
1 —= 5.1 
| f(x, 6, ) (5.1) 
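where the $\lambda_r$ are constants, not depending on $\theta_1, \ldots, \theta_s$, where the
$\pi_r(z)$ are Legendre polynomials standardized on the interval $(-\tfrac12, \tfrac12)$,
and where

$$z = \int_{-\infty}^{x} f(t, \theta)\,dt - \tfrac12. \qquad (5.2)$$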

The null hypothesis is defined by $\lambda_1 = \cdots = \lambda_K = 0$; if it is true the
probability integral transformation (5.2), when applied to a sample
$x_1, \ldots, x_N$, will lead to a set, $z_1, \ldots, z_N$, uniformly distributed on
$(-\tfrac12, \tfrac12)$, provided $\theta$ is known. Since the $\pi_r(z)$ are smooth functions
of $z$ (and so of $x$), a small value of $K$ and small $\lambda_r$ will mean that (5.1)
differs smoothly from the null hypothesis. For $\theta$ known, it is easily
seen that $u_r = N^{-1/2}\sum_{\alpha=1}^{N} \pi_r(z_\alpha)$ $(r = 1, \ldots, K)$ are asymptotically
uncorrelated normal variates with unit variances and means $\lambda_r$. Hence
$\psi_K^2 = \sum_{r=1}^{K} u_r^2$ is a good statistic for testing $\lambda_1 = \cdots = \lambda_K = 0$, since it
has a non-central $\chi^2$ distribution with parameter $\sum_{r=1}^{K} \lambda_r^2$.
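For known $\theta$ the statistic is simple to compute. The following is a minimal sketch for $K = 2$, using the first two Legendre polynomials orthonormal on $(-\tfrac12, \tfrac12)$, namely $2\sqrt{3}\,z$ and $\sqrt{5}\,(6z^2 - \tfrac12)$; the exponential null distribution in the usage lines is an assumption made only to have something to transform, and the function names are illustrative.

```python
import random
from math import exp, sqrt

def pi_1(z):
    """First Legendre polynomial, orthonormal on (-1/2, 1/2)."""
    return 2.0 * sqrt(3.0) * z

def pi_2(z):
    """Second Legendre polynomial, orthonormal on (-1/2, 1/2)."""
    return sqrt(5.0) * (6.0 * z * z - 0.5)

def smooth_components(z_values, polys=(pi_1, pi_2)):
    """Components u_r = N^{-1/2} * sum of pi_r(z) over the transformed sample."""
    n = len(z_values)
    return [sum(p(z) for z in z_values) / sqrt(n) for p in polys]

def psi_squared(z_values, polys=(pi_1, pi_2)):
    """Smooth test statistic: the sum of squared components."""
    return sum(u * u for u in smooth_components(z_values, polys))

# Usage: exponential null hypothesis with known rate theta, so z = F(x) - 1/2
random.seed(0)
theta = 2.0
z = [0.5 - exp(-theta * random.expovariate(theta)) for _ in range(500)]
print(smooth_components(z), psi_squared(z))
```

On the null hypothesis the printed statistic would be referred to a chi-square distribution with two degrees of freedom.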

When $\theta$ is unknown, and estimators $\theta_1^*, \ldots, \theta_s^*$ are used in (5.2) to
find $z_1^*, \ldots, z_N^*$ and hence $\psi_K^{*2}$, Barton shows that, even on the null
hypothesis, the distribution of $\psi_K^{*2}$ is not $\chi^2_{K-s}$, as might be supposed,
at least for maximum likelihood estimators, but of a form close to that
of Section 2. The latent roots involved will be functions of the unknown
parameters except when they are location and scale parameters. Because
the distribution of $\psi_K^{*2}$ is never simple, it is not likely to enjoy much
practical application.

Although one of Neyman’s aims in suggesting this approach was to 
avoid the arbitrariness of class intervals, they are undoubtedly con- 
venient with the large samples that are likely to be used in these tests. 
Moreover, it will appear from our extension of Barton’s Theorem III, 
that they enable us to get a test statistic with a workable distribution. 
As with the discussion in Section 4, two types of grouping may be used. 
Barton uses the terms pre- and post-grouping. 

In pre-grouping, fixed class boundaries are used, say $-\infty < g_1
< \cdots < g_{k-1} < \infty$, to divide the range of $x$ into $k$ intervals. If $\xi_l$ is
the median of the $l$th class, all the observations in that class will be
considered as having this common value $\xi_l$. Writing $p_l$ for the
probability content of the $l$th class, $\zeta_l$, which replaces $z$ in the ungrouped

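case, is defined by

$$\zeta_l = \sum_{j<l} p_j + \tfrac12\,p_l - \tfrac12. \qquad (5.3)$$

The generalization of $u_r$ is

$$u_r = N^{-1/2}\sum_{l=1}^{k} n_l\,P_r(\zeta_l), \qquad (5.4)$$

where $n_l$ is the frequency in the $l$th class and $P_r(\zeta_l)$ are orthogonal
polynomials replacing the $\pi_r(z)$. If the $P_r(\zeta_l)$ are defined to satisfy

$$\sum_{l=1}^{k} p_l\,P_r(\zeta_l) = 0, \qquad \sum_{l=1}^{k} p_l\,P_r(\zeta_l)\,P_s(\zeta_l) = \delta_{rs}, \qquad (5.4a)$$

then it is easily seen from Section 2 that, when $\theta$ is known, and the null
hypothesis is true,

$$\psi_K^2 = \sum_{r=1}^{K} u_r^2 \sim \chi^2_K \qquad (5.5)$$

as $N \to \infty$ and where $K \le k - 1$. Barton however finds the non-null
distribution of $\psi_K^2$ and discusses the effect of estimation on this procedure.
His main result, for our purposes, is that if $\theta$ is replaced by its maximum
likelihood estimator $\theta^\dagger$ based on the class frequencies, the null
distribution of $\psi_K^{*2}$ is $\chi^2_{K-s}$. This corresponds to the well known result
for $X^2$.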

By post-grouping, Barton means that the group boundaries are
adjusted to give constant estimated class probabilities. His theorem
for this case may be extended and completed by assuming, as in Section
4, that the new class intervals are defined by functions $g_i(\theta^*)$ where $\theta^*$
is some estimator of $\theta$. Suppose now that we proceed as in the
pre-grouped case. The class probabilities will be taken as (4.14) and the
estimator $\theta^\dagger$ found from (4.15). Instead of (5.3), we will write

$$\zeta_l^* = \sum_{j<l} p_j^* + \tfrac12\,p_l^* - \tfrac12, \qquad (5.6)$$

where

$$p_l^* = p_l(\theta^\dagger, \theta^*).$$

The orthogonal system $P_r(\zeta_l^*)$ will be found as before so that (5.4a) is
satisfied but, in doing so, $\theta^*$ must be taken as an adjustable parameter




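so that (5.4a) is true for all $\theta^*$. Finally, we may define

$$u_r^* = N^{-1/2}\sum_{l=1}^{k} n_l\,P_r(\zeta_l^*) \qquad (5.7)$$

and

$$\psi_K^{*2} = \sum_{r=1}^{K} u_r^{*2}. \qquad (5.8)$$

To find the asymptotic distribution of $\psi_K^{*2}$, it is merely necessary to
examine (5.7). Only a sketch will be given since the methods required
are now familiar. The $n_l$ will again have the representation (4.13)
while, to the order required,

$$P_r(\zeta_l^*) = P_r(\zeta_l) + (\theta^\dagger - \theta)'\,\frac{\partial P_r(\zeta_l)}{\partial\theta^\dagger} + (\theta^* - \theta)'\,\frac{\partial P_r(\zeta_l)}{\partial\theta^*}, \qquad (5.9)$$

where unstarred symbols are those which result from putting $\theta^* =
\theta^\dagger = \theta$. In the notation of (4.12),

$$\frac{\partial\zeta_l}{\partial\theta^\dagger} = \sum_{j<l} u_j(\theta) + \tfrac12\,u_l(\theta), \qquad \frac{\partial\zeta_l}{\partial\theta^*} = \sum_{j<l} v_j(\theta) + \tfrac12\,v_l(\theta). \qquad (5.10)$$

Using (4.13), (5.9), (5.10) in (5.7) and retaining only terms $O_p(1)$, we
find that

$$u_r^* = N^{-1/2}\sum_{l=1}^{k} m_l\left[P_r(\zeta_l) + (\theta^\dagger - \theta)'\,\frac{\partial P_r(\zeta_l)}{\partial\theta^\dagger}\right] + \sqrt{N}\,(\theta^* - \theta)'\sum_{l=1}^{k}\left[v_l(\theta)\,P_r(\zeta_l) + \frac{m_l}{N}\,\frac{\partial P_r(\zeta_l)}{\partial\theta^*}\right] + o_p(1). \qquad (5.11)$$

The first term in (5.11) is that which appears in the pre-grouping case
using intervals $g_i(\theta)$. The second term tends to zero in probability
since $\sqrt{N}\,(\theta^* - \theta) = O_p(1)$ and the other factor has a variance which
is $O(N^{-1})$ and a zero mean. To see the latter point, the mean in question
is

$$\sum_{l=1}^{k}\left[v_l(\theta)\,P_r(\zeta_l) + p_l\,\frac{\partial P_r(\zeta_l)}{\partial\theta^*}\right]. \qquad (5.12)$$

But the relations (5.4a) now hold for all $\theta^*$, and when the identity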




$\sum_l p_l^*\,P_r(\zeta_l^*) = 0$ is differentiated with respect to $\theta^*$, the left-hand
side, with $\theta^* = \theta$, is simply (5.12). Thus only the first term of (5.11)
contributes to $u_r^*$, which then has the same distribution as in the
pre-grouped case. It follows then that $\psi_K^{*2}$ is again $\chi^2_{K-s}$ on the null
hypothesis. This result corresponds to the last result of Section 4, and provides
the solution to a slightly more general problem than that posed by Barton
since the estimated class probabilities are here no longer required to be
constant. Again we interpret this result as meaning that grouping the
sample according to rules based on the sample estimators of the
parameters has no influence on the asymptotic distribution of Barton's grouped
form of Neyman's statistic. Finally, we remark that the $u_r^{*2}$ are
partitions of the $X^2$ of Section 4 (in particular $\psi_{k-1}^{*2} = X^2$) and that this
result could have been obtained from the theory given there. But to
check that this was so would have taken as much space as, and been less
informative than, the direct argument given above.

The Barton-Neyman test has much to recommend it (greater power
than $X^2$ if $K$ is small) with almost equal ease of application. In the
hope of encouraging its further investigation for practical use, we quote
Barton's formulae for $P_1(\zeta)$, $P_2(\zeta)$:


PAS) = V3 [28], (5.13) 
PAS) = V5 [6¢° 3(1 — Sos) + 
where 
= tipi ’ 
(5.14) 


Cy = 2(4 3608,, 9So; 10So3 = 


As Barton has given no worked examples, his test will be applied to the
data in Cramér ([1954], Table 30.4.3), on the breadths of 12000 beans
(Phaseolus vulgaris). It may be shown here that, with a null hypothesis
of normality,


= 3.49032¢, , 
Pf.) = 17.59329(6¢? — 0.492543 + .000006¢,), 
U, — 2.5503, U2 — 10.7979, 


so that

$$U_1^2 = 6.5, \qquad U_2^2 = 116.6.$$

In this example there are 16 classes so that the effects of differing
estimators and the method of determining the class boundaries may be



neglected. Further, since $N$ is 12000, we may assume that Cramér's
$X^2 = 196.5$ is accurately a chi-square with 13 degrees of freedom.
$U_1^2$ and $U_2^2$ are components of this chi-square, each with one degree of
freedom, which are both significant, $U_2^2$ very much so, but the remainder,
with eleven degrees of freedom,

$$X^2 - U_1^2 - U_2^2 = 73.4$$

is still significant. Cramér has shown by fitting an Edgeworth series to
this observed distribution that the introduction of the term in $g_1$ alone
reduces $X^2$ to 34.3 while the inclusion of the further term in $g_2$ (and $g_1^2$)
makes $X^2$ only 14.9, a non-significant value. There are several lessons
to be learned from this example. Cramér's analysis shows that the
deviation from normality is "smooth"; in fact, that all but 34.3 of $X^2$
is due to skewness. Barton's components $U_1^2, U_2^2, \ldots$ have no such
immediately helpful interpretations and they may not always be in
decreasing order of importance. Thus we have the situation, common in
statistics, of having a test which would be ideal if we knew how many
components should be used (i.e. the value of $K$) falling down in practice
where we do not know this. Nor is the total reduction 123.1 as large as
that obtained by Cramér, 181.6, for two degrees of freedom. However
it is unfair to expect that a partitioning of $X^2$ of general utility should be
as efficient (in the sense of reducing $X^2$) as a partitioning specifically
designed for a given null hypothesis distribution. For the Edgeworth
series analysis by Cramér arises naturally out of the orthogonal
polynomials for the normal density function.
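The arithmetic of this partition is easily checked. The short sketch below uses only the figures quoted in the example (the total 196.5 on 13 degrees of freedom, the two component values 2.5503 and 10.7979 taken in absolute value, and Cramér's final value 14.9); the variable names are illustrative.

```python
# Figures quoted in the worked example
x2_total, df_total = 196.5, 13
u1, u2 = 2.5503, 10.7979          # component values; only their squares are used

components = [u1 ** 2, u2 ** 2]   # single-degree-of-freedom components
remainder = x2_total - sum(components)
df_remainder = df_total - len(components)

barton_reduction = sum(components)        # reduction achieved by the two components
cramer_reduction = x2_total - 14.9        # reduction achieved by the Edgeworth terms

print([round(c, 1) for c in components], round(remainder, 1), df_remainder)
print(round(barton_reduction, 1), round(cramer_reduction, 1))
```

The comparison of the two reductions is the one made in the text: the general-purpose components remove less of $X^2$ than terms designed for the normal null hypothesis.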

This leads us to the suggestion made by Lancaster [1953, 1957, 1958]
that partitioning of $X^2$ should often be based on the series of orthogonal
polynomials whose weight function is the null hypothesis density.
Lancaster has not shown how to deal with the effects of grouping and
estimation. Any theory taking these into account, as Barton has done
for his partitioning, runs into difficulties unless the unknown parameters
are those of scale and location. For suppose that the density on the
alternative hypothesis may be written as

$$h(x) = f(x)\sum_{r \ge 0} \lambda_r f_r(x), \qquad (5.15)$$

where $\lambda_0 = 1$ and the $f_r(x)$ are polynomials such that $f_r(x)$
has degree $r$,

$$\int f_r(x)\,f_s(x)\,f(x)\,dx = \begin{cases} 1, & r = s, \\ 0, & r \ne s. \end{cases} \qquad (5.16)$$




Then if $f(x)$ contains parameters, the restrictions (5.16) will mean that
the $f_r(x)$ must also contain them. If the parameters are only introduced
into (5.15) by replacing $x$ by $(x - \theta_1)\theta_2^{-1}$, this is not so and standard
families of polynomials may often be used. However the theory given
in Section 2 (to which, we have shown, all cases may be reduced) may
be used to construct suitable linear functions for the partitioning of $X^2$.
With class boundaries and probabilities $g_i$ and $p_i$ as before, quantities

$$F_{ri} = \frac{1}{p_i}\int_{g_{i-1}}^{g_i} f_r(x)\,f(x)\,dx$$

may be defined, leading to linear forms $N^{-1/2}\sum_{i=1}^{k} n_i F_{ri}$, which are the
natural analogues of the linear forms which are appropriate when the
sample is ungrouped. They are not satisfactory here because they are
not uncorrelated with unit variance. By using the earlier theory this
could be overcome, along with the troubles due to the fact that the
$F_{ri}$ will contain estimated parameters, but no further discussion can be
given here.
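The loss of unit variance under grouping can be seen in the simplest case. The sketch below assumes a standard normal null density and takes the first orthonormal polynomial for that weight, $f_1(x) = x$, so that the class averages have the closed form $[\varphi(g_{i-1}) - \varphi(g_i)]/p_i$; the function names and the choice of boundaries are illustrative.

```python
from math import erf, exp, pi, sqrt, inf

def phi(x):
    """Standard normal density, taken as 0 at +/- infinity."""
    return 0.0 if x in (inf, -inf) else exp(-0.5 * x * x) / sqrt(2.0 * pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def first_order_scores(boundaries):
    """Class probabilities p_i and class averages F_i of f_1(x) = x."""
    g = [-inf] + list(boundaries) + [inf]
    p = [Phi(g[i + 1]) - Phi(g[i]) for i in range(len(g) - 1)]
    # the integral of x * phi(x) over (a, b] equals phi(a) - phi(b)
    F = [(phi(g[i]) - phi(g[i + 1])) / p[i] for i in range(len(p))]
    return p, F

p, F = first_order_scores([-1.0, 0.0, 1.0])
mean = sum(pi_ * f for pi_, f in zip(p, F))
variance = sum(pi_ * f * f for pi_, f in zip(p, F))
print(mean, variance)   # the mean vanishes; the variance falls short of one
```

With these four classes the variance of a single grouped score is below one, so the corresponding linear form would need rescaling before its square could be treated as a chi-square component.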


6. Acknowledgments 


It is a pleasure to acknowledge some helpful general remarks by 
John W. Tukey and some particular remarks on conditional distributions 
by C. P. Steck. 


REFERENCES


Barton, D. E. [1953a]. On Neyman's smooth test of goodness-of-fit and its power
with respect to a particular set of alternatives. Skand. Akt. Tid. 36, 24-63.

Barton, D. E. [1953b]. The probability distribution function of a sum of squares.
Trabajos de Estadística 4, 199-207.

Barton, D. E. [1955]. A form of Neyman's $\psi_k^2$ test of goodness of fit applicable to
grouped and discrete data. Skand. Akt. Tid. 39, 1-17.

Barton, D. E. [1956]. Neyman's $\psi_k^2$ test of goodness-of-fit when the null hypothesis
is composite. Skand. Akt. Tid. 40, 216-45.

Chernoff, H. and Lehmann, E. L. [1954]. The use of maximum likelihood estimates
in $\chi^2$ tests for goodness of fit. Ann. Math. Stat. 25, 579-86.

Cochran, W. G. [1952]. The $\chi^2$ goodness-of-fit test. Ann. Math. Stat. 23, 315-45.

Cochran, W. G. [1954]. Some methods for strengthening the common $\chi^2$ tests.
Biometrics 10, 417-51.

Cochran, W. G. [1955]. A test of a linear function of the deviations between
observed and expected numbers. J. Amer. Stat. Assoc. 50, 377-97.

Cramér, H. [1954]. Mathematical Methods of Statistics. Princeton: Princeton
University Press.

Darwin, J. [1958]. On corrections to the chi-squared distribution. J. R. Stat. Soc.
20, 387-92.

Fisher, R. A. [1928]. On a property connecting the chi-square measure of
discrepancy with the method of maximum likelihood. Atti del Congresso Internazionale
dei Matematici, Bologna, 6, 94-100.




Gani, J. [1955]. Some theorems and sufficiency conditions for the maximum
likelihood estimator of an unknown parameter in a simple Markov chain. Biometrika
42, 342-59.

Good, I. J. [1957]. Saddle-point methods for the multinomial distribution. Ann.
Math. Stat. 28, 861-81.

Halmos, P. and Savage, L. J. [1949]. Application of the Radon-Nikodym Theorem
to the theory of sufficient statistics. Ann. Math. Stat. 20, 225-41.

Hoel, P. G. [1938]. On the chi-square distribution for small samples. Ann. Math.
Stat. 9, 158-65.

Lancaster, H. O. [1953]. A reconciliation of chi-square, considered from the metrical
and enumerative aspects. Sankhya 13, 1-10.

Lancaster, H. O. [1957]. Some properties of the bivariate normal distribution
considered in the form of a contingency table. Biometrika 44, 289-92.

Lancaster, H. O. [1958]. The structure of bivariate distributions. Ann. Math.
Stat. 29, 719-36.

Neyman, J. [1937]. "Smooth test" for goodness-of-fit. Skand. Akt. Tid. 20, 150-99.

Rao, C. R., and Chakravarti, I. M. [1956]. Some small sample tests of significance
for a Poisson distribution. Biometrics 12, 264-82.

Roy, A. R. [1956]. On chi-squared statistics with variable intervals. Department
of Statistics, Stanford University, Technical Report.

Roy, S. N. and Mitra, S. K. [1955]. An introduction to some non-parametric
generalizations of analysis of variance and multivariate analysis. North Carolina
Institute of Statistics Mimeograph Series 139.

Watson, G. S. [1957a]. Sufficient statistics, similar regions and distribution free
tests. J. R. Stat. Soc., Series B, 19, 262-67.

Watson, G. S. [1957b]. The chi-squared goodness-of-fit test for normal distributions.
Biometrika 44, 336-48.

Watson, G. S. [1958]. On chi-square goodness-of-fit tests for continuous
distributions. J. R. Stat. Soc., Series B, 20, 44-61.



THE SAMPLING VARIANCE OF THE GENETIC 
CORRELATION COEFFICIENT 


ALAN ROBERTSON* 


Institute of Animal Genetics 
Edinburgh, Scotland 


In the genetics of continuous variation, the genetic correlation 
coefficient between any two characters plays a part in the discussion of 
correlated responses under selection, and of the combination of different 
measurements to secure maximum improvement, the so-called selection 
index. Estimation methods for the genetic correlation coefficient are 
similar to those used for the partitioning of variances of the separate 
characters—on the one hand, the analysis of the correlations between 
offspring and parent and, on the other, the analysis of the variance and 
covariance components for the two characters within and between 
groups of relatives. In the former case, formulae for the sampling 
variance have been derived by Reeve [1955], but none, even crude 
approximations, is available for the latter. It is the object of the 
present paper to present formulae for the special case in which the two 
characters have the same heritability (i.e. the same intra-group correl- 
ations). The broad pattern of the results does, however, give a very 
strong hint of the probable values in the more general case. 

Expressing the problem purely in statistical terms, we have a one- 
way classification with measurements of two variables. We are then 
concerned with estimating the correlation between the true (as distinct 
from the observed) class means for the two characters. The procedure 
then entails the calculation of the between-class variance and covariance 
components and the derivation of the correlation coefficient from these 
in the usual way. The problem has been discussed formally by Kemp- 
thorne [1957 pp. 264-7] who uses the term intrinsic correlation. The 
use of the intrinsic correlation as a genetic correlation implies that each 
class is a genetic entity and that there is no non-genetic variation 
common to members of the same class. The nature of the similarity 
between class members is not relevant to the general statistical solution 


*Member of staff of Agricultural Research Council of Great Britain. 




but they may be full- or half-sisters or merely members of the same 
genetic strain. We shall refer to members of the same class as relatives. 

The usual form of the analysis can be presented for $N$ groups of $n$
relatives each. The following table shows the expected values of the
variances, mean squares, and cross products. The number subscripts
refer to the characters and the letters to the between and within group
components.


TABLE 1
EXPECTED VALUES OF VARIANCES, MEAN SQUARES, AND CROSS PRODUCTS

                  d.f.         Variances (1)                    Variances (2)                    Covariance
Between groups    N - 1        $\sigma_{w1}^2 + n\sigma_{b1}^2$    $\sigma_{w2}^2 + n\sigma_{b2}^2$    $\mathrm{cov}_w(1, 2) + n\,\mathrm{cov}_b(1, 2)$
Within groups     N(n - 1)     $\sigma_{w1}^2$                     $\sigma_{w2}^2$                     $\mathrm{cov}_w(1, 2)$


If the design of the experiment is such that the between group
components are entirely genetic in origin, then the correlation coefficient
is given by

$$r_g = \frac{\mathrm{cov}_b(1, 2)}{\sigma_{b1}\,\sigma_{b2}}.$$


As the essence of this approach is to cast the analysis in the form of
a single analysis of variance, we must first consider the simple correlation
between pairs of measurements $X$, $Y$ in terms of the analysis of variance.
First we transform both variates into standard measure by expressing
them relative to their true means and dividing by their respective
standard deviations. Each variate $x$, $y$ now has unit standard deviation.
The expected between pair mean square is of the form $E[(x + y)^2/2]$
and that within pairs is of the form $E[(x - y)^2/2]$. The total variance in
the two characters therefore splits into two parts, $1 + r$ in the between
group mean square and $1 - r$ in the within group mean square.
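The split into $1 + r$ and $1 - r$ can be checked directly on any set of pairs. The sketch below standardizes by the sample means and standard deviations (an assumption made so that the identity holds exactly in the sample, whereas the text standardizes by the true values); the function names and the data are illustrative.

```python
from math import sqrt

def standardize(values):
    """Express values relative to their mean, in units of their standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pair_mean_squares(xs, ys):
    """Return (r, between-pair mean square, within-pair mean square)."""
    x, y = standardize(xs), standardize(ys)
    n = len(x)
    r = sum(a * b for a, b in zip(x, y)) / n
    between = sum((a + b) ** 2 / 2 for a, b in zip(x, y)) / n
    within = sum((a - b) ** 2 / 2 for a, b in zip(x, y)) / n
    return r, between, within

r, between, within = pair_mean_squares([1, 2, 3, 4, 5], [2, 1, 4, 3, 6])
print(r, between, within)   # between equals 1 + r and within equals 1 - r
```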

Returning now to our N groups each with n pairs of observations, 
in each character we can break down the total variance into between 
group and within group components. If, and only if, in the two 
characters the between group variance makes up the same proportion 
of the total variance, we can, by dividing each deviate by its total 
standard deviation, reduce the situation to one in which, both in between 
and in within group components, the two characters have equal variance. 
In each character, let us assume that the between group component in 




each character makes up a fraction $t$ (the intra-class correlation) of the
total variance of each character. Now taking the two characters
jointly, the total between group variation $2t$ in the two characters we
break into a fraction $t(1 + r_g)$ for that between groups (averaged over
characters) and $t(1 - r_g)$ for a group × character interaction. We
similarly break down that within groups, $2(1 - t)$, into fractions
$(1 - t)(1 + r_w)$ and $(1 - t)(1 - r_w)$ respectively, where $r_w$ is the within group
correlation.

For those who think conveniently in terms of linear hypotheses, we
are assuming that the measurement of character 1 in the $j$th member of
the $i$th group is given by

$$g_{1i} + c_{ij} + e_{1ij}$$

and that of character 2 by

$$g_{2i} + c_{ij} + e_{2ij},$$

where $g_{1i}$ is the value of the $i$th group for character 1, $c_{ij}$ is the within
group element of animal $(ij)$ common to both characters, and the $e$'s
are the residual errors, presumed to have equal variances for the two
characters. Because $\sigma^2_{g_1} = \sigma^2_{g_2}$, the first two terms will be
statistically independent.

Bearing in mind the size of each group, the analysis of variance then
reads as in Table 2 where $\sigma^2$ is the standard variance, $t$ is the
intra-group correlation of each measurement and $r_g$, $r_w$ are the between and
within group correlations.

TABLE 2
EXPECTED MEAN SQUARES FOR THE COMPLETE ANALYSIS OF VARIANCE

                         d.f.         Expected M.S.                                    Designation
Characters               1            not relevant
Groups                   N - 1        $\sigma^2[(1 - t)(1 + r_w) + nt(1 + r_g)]$       A
Groups × characters      N - 1        $\sigma^2[(1 - t)(1 - r_w) + nt(1 - r_g)]$       B
Within groups
  Individuals            N(n - 1)     $\sigma^2(1 - t)(1 + r_w)$                       C
  Remainder              N(n - 1)     $\sigma^2(1 - t)(1 - r_w)$                       D


In working out the expected mean squares, it must be noted that

(a) in any character, a fraction $t$ of the variance appears between




groups and $1 - t$ within groups ($t$ will of course depend on the
heritability of the characters and on the degree of relationship between
members of the same group),

(b) the correlation within groups, $r_w$, will not be the same as that
between groups, $r_g$, and

(c) variation between individuals makes no contribution to the
group × character interaction mean square.

The expected mean squares can then be written as in the table.
Writing the four mean squares as $A$, $B$, $C$, $D$, we then have

$$\hat r_g = \frac{A - C - B + D}{A - C + B - D}.$$


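If we assume that the parent distributions are normal, we have for
the variance of $A$, $V_A = 2A^2/(N - 1)$ and so on.

$\hat r_g$ is of the form $(\alpha - \beta)/(\alpha + \beta)$, where $\alpha = A - C$ and $\beta = B - D$.
Then

$$V(\hat r_g) = V\!\left(\frac{\alpha - \beta}{\alpha + \beta}\right) = \frac{4(\beta^2 V_\alpha + \alpha^2 V_\beta)}{(\alpha + \beta)^4} = \frac{V_\alpha(1 - r_g)^2 + V_\beta(1 + r_g)^2}{4n^2t^2\sigma^4},$$

since $\alpha = nt\sigma^2(1 + r_g)$ and $\beta = nt\sigma^2(1 - r_g)$.
Since the separate mean squares are statistically independent, we may
write

$$V_\alpha = V_A + V_C, \qquad V_\beta = V_B + V_D,$$

so that the problem is formally solved. However, it is desirable to find
a solution in terms of the basic parameters, $t$, $r_g$, and $r_w$. Obviously
any such solution will be of the form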
any such solution will be of the form 




$$V(\hat r_g) = \frac{1}{n^2t^2}\left[\frac{P}{N - 1} + \frac{Q}{N(n - 1)}\right].$$

In evaluating $P$ and $Q$, we make repeated use of the identity

$$a^2 + b^2 = 2\left[\left(\frac{a + b}{2}\right)^2 + \left(\frac{a - b}{2}\right)^2\right].$$

Then

$$2P = (1 - r_g)^2[(1 - t)(1 + r_w) + nt(1 + r_g)]^2 + (1 + r_g)^2[(1 - t)(1 - r_w) + nt(1 - r_g)]^2$$

and similarly $Q = (1 - t)^2[(r_g - r_w)^2 + (1 - r_g r_w)^2]$.

This appears to be as simple an expression as can be obtained.
Before discussing the form of the solution and its behaviour in special
cases, a slightly different form of the analysis must be mentioned in
which the two characters are in fact measured on different individuals.
This occurs for instance in the case of milk production and carcass
conformation in dairy cattle and, in general, when the two characters
are in fact the same measurement under two different treatments. The
analysis of variance is then as shown in Table 3:


TABLE 3
MEAN SQUARES: CHARACTERS MEASURED ON DIFFERENT INDIVIDUALS

                                  d.f.          Expected M.S.              Designation
Between groups                    N - 1         $1 - t + nt(1 + r_g)$      A
Group × character interaction     N - 1         $1 - t + nt(1 - r_g)$      B
Within groups and characters      2N(n - 1)     $1 - t$                    C

Then

$$\hat r_g = \frac{A - B}{A + B - 2C}$$

and, in a similar way to the first case,

$$V'(\hat r_g) = \frac{1}{4n^2t^2}\left[V_A(1 - r_g)^2 + V_B(1 + r_g)^2 + V_C(4r_g^2)\right] = \frac{1}{n^2t^2}\left[\frac{P}{N - 1} + \frac{Q}{N(n - 1)}\right],$$


j — 
3 
ats 
| 
| 
y fc 
= 
ir 
ly 
4 | 


474 BIOMETRICS, SEPTEMBER. 1959 


where 
= +0 +r — 
and 
Q = A1 — 


As might be expected, this is similar to what we get from putting
$r_w = 0$ in the earlier case, with the difference of the second term in Q.
This arises from the lumping together of the two within group terms into
one.

Examination of the expression for $V - V'$ shows that this will be
negative if both $n$ and $r_G r_w$ are large. It will be positive if $n$ is small and
$r_w$ is large and much bigger than $r_G$. However, it is only very occasionally
that we have any choice of method in practice.
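The sign of $V - V'$ can be explored numerically. The following is a minimal sketch rather than the author's own computation: it applies the large-sample variance of a ratio of mean squares, taking a mean square $M$ on $f$ degrees of freedom to have variance $2M^2/f$. The expected mean squares used for the first design are an assumption, chosen so that $A - C = nt(1 + r_G)$ and $B - D = nt(1 - r_G)$ with within-group correlation `r_w`; those for the second design are the entries of Table 3. The function names are illustrative only.

```python
def var_rg_same_individuals(N, n, t, r_g, r_w):
    """Approximate V(r_G-hat) when both characters are measured on every individual."""
    nt = n * t
    # Assumed expected mean squares (unit phenotypic variance for each character).
    A = (1 - t) * (1 + r_w) + nt * (1 + r_g)   # between groups
    B = (1 - t) * (1 - r_w) + nt * (1 - r_g)   # group x character interaction
    C = (1 - t) * (1 + r_w)                    # within groups
    D = (1 - t) * (1 - r_w)                    # individual x character interaction
    V_A, V_B = 2 * A**2 / (N - 1), 2 * B**2 / (N - 1)
    V_C, V_D = 2 * C**2 / (N * (n - 1)), 2 * D**2 / (N * (n - 1))
    V_alpha, V_beta = V_A + V_C, V_B + V_D     # alpha = A - C, beta = B - D
    return (V_alpha * (1 - r_g)**2 + V_beta * (1 + r_g)**2) / (4 * nt**2)


def var_rg_different_individuals(N, n, t, r_g):
    """Approximate V'(r_G-hat) when the characters are measured on different individuals."""
    nt = n * t
    A = 1 - t + nt * (1 + r_g)                 # between groups (Table 3)
    B = 1 - t + nt * (1 - r_g)                 # group x character interaction
    C = 1 - t                                  # within groups and characters
    V_A, V_B = 2 * A**2 / (N - 1), 2 * B**2 / (N - 1)
    V_C = 2 * C**2 / (2 * N * (n - 1))
    return (V_A * (1 - r_g)**2 + V_B * (1 + r_g)**2 + 4 * r_g**2 * V_C) / (4 * nt**2)


if __name__ == "__main__":
    # Large n and large r_G * r_w: the difference V - V' should be negative.
    print(var_rg_same_individuals(100, 50, 0.1, 0.8, 0.8)
          - var_rg_different_individuals(100, 50, 0.1, 0.8))
    # Small n, r_w large and much bigger than r_G: the difference should be positive.
    print(var_rg_same_individuals(100, 2, 0.25, 0.0, 0.9)
          - var_rg_different_individuals(100, 2, 0.25, 0.0))
```

The two printed differences illustrate the two regimes described in the text; other parameter values can of course fall between them.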


The behaviour of the solution in special situations 


(i) nt large. As this means that n is large, it also implies that
$P > Q$. Then $P \to n^2t^2(1 - r_G^2)^2$ and $V(\hat{r}_G) \to (1 - r_G^2)^2/(N - 1)$.

The variance within groups then makes little contribution to the
sampling error and we are left with the usual formula for the sampling
variance of a correlation coefficient.

(ii) $r_G = r_w$. When n is large we will in general know $r_w$ much more
accurately than we know $r_G$ and a formula for sampling error assuming
$r_G = r_w$ is useful.

$$P = (1 - r_G^2)^2[1 + (n - 1)t]^2,$$
$$Q = (1 - r_G^2)^2(1 - t)^2,$$


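and

$$V(\hat{r}_G) = \frac{(1 - r_G^2)^2}{n^2t^2}\left[\frac{[1 + (n - 1)t]^2}{N - 1} + \frac{(1 - t)^2}{N(n - 1)}\right].$$

If we write the denominator of the last term as $(N - 1)(n - 1)$,
this will involve an overall proportional overestimate by a trivial factor
of less than $1/N(n - 1)$. The formula then reduces to

$$V(\hat{r}_G) = \frac{(1 - r_G^2)^2}{t^2}\cdot\frac{(1 - t)^2 + 2(n - 1)t(1 - t) + n(n - 1)t^2}{(N - 1)n(n - 1)}.$$

It should be noted that the simplest formula for the sampling variance
of the estimate of t from the same data is

$$V(\hat{t}) = \frac{2(1 - t)^2[1 + (n - 1)t]^2}{(N - 1)n(n - 1)}.$$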


Since the estimation of t is in essence the estimation of the heritability,
$h^2$ (the two differing by a factor dependent on the degree of
relationship within groups), we may write

$$\frac{V(\hat{t})}{t^2} = \frac{V(h^2)}{h^4}$$

and

$$V(\hat{r}_G) \simeq \frac{(1 - r_G^2)^2}{2}\cdot\frac{V(h^2)}{h^4}$$

since the other terms will be similar in magnitude. It follows then that
unless both $r_G$ and $h^2$ are high, the sampling variance of $r_G$ will be much
greater than that of $h^2$. We shall return to this later in the search for
a more general solution.

But one interesting modification of the formula may be noted.
Writing s.e. for standard error and C.V. for coefficient of variation we
have

$$\text{s.e.}(\hat{r}_G) \simeq \frac{1 - r_G^2}{\sqrt{2}}\cdot\frac{\text{s.e.}(h^2)}{h^2} = \frac{1 - r_G^2}{\sqrt{2}}\,\text{C.V.}(h^2).$$

Thus in any experiment the standard error of $r_G$ will be slightly less than
the coefficient of variation of $h^2$.
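The relation between the standard error of the genetic correlation and the coefficient of variation of the heritability is easily evaluated. The sketch below assumes the form s.e.$(\hat{r}_G) \simeq (1 - r_G^2)/\sqrt{2}$ times s.e.$(h^2)/h^2$, which applies to two characters of equal heritability with $r_G = r_w$; the function name is illustrative.

```python
import math


def se_genetic_correlation(r_g, h2, se_h2):
    """Approximate s.e. of r_G-hat from the heritability and its standard error.

    Assumes both characters share the heritability h2 and that the genetic
    and within-group correlations are equal.
    """
    cv_h2 = se_h2 / h2                      # coefficient of variation of h2
    return (1 - r_g**2) / math.sqrt(2) * cv_h2


if __name__ == "__main__":
    # A 20 per cent coefficient of variation for h2 and a small r_G.
    print(round(se_genetic_correlation(0.0, 0.25, 0.05), 3))
```

With a coefficient of variation of 20 per cent and $r_G$ near zero the result is about 0.14, and it falls as $r_G$ moves away from zero.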
When n = 2, the above formula reduces to

$$\text{s.e.}(\hat{r}_G) = \frac{1 - r_G^2}{t}\sqrt{\frac{1 + t^2}{2(N - 1)}}.$$

Now the case, previously derived by Reeve, referred to the coefficient
calculated by comparison of correlations between the different measurements
on daughters and dams, for which he obtained a solution, when
$r_G = r_w = 0$, of $[\tfrac{1}{2} + (2/h^4)]/(N - 1)$. Now a daughter-dam pair is a
group of animals of size 2 with a genetic relationship of 0.5, i.e. $t = \tfrac{1}{2}h^2$.
Substitution in the above gives

$$V(\hat{r}_G) = \frac{1}{N - 1}\left[\frac{1}{2} + \frac{2}{h^4}\right]$$

in comforting agreement with Reeve's formula.
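The substitution behind this agreement can be written out. The steps below are a sketch that assumes the $r_G = r_w$ variance in the form $V(\hat{r}_G) = \frac{(1 - r_G^2)^2}{n^2t^2}\left[\frac{[1 + (n - 1)t]^2}{N - 1} + \frac{(1 - t)^2}{N(n - 1)}\right]$, with $N$ replaced by $N - 1$ in the last term; that form is reconstructed from the surrounding definitions and is not quoted from the text.

```latex
\begin{aligned}
V(\hat{r}_G)\Big|_{n = 2,\; r_G = r_w = 0}
  &\simeq \frac{(1 + t)^2 + (1 - t)^2}{4t^2 (N - 1)}
   = \frac{1 + t^2}{2t^2 (N - 1)} \\
  &= \frac{1}{N - 1}\left[\frac{1}{2} + \frac{1}{2t^2}\right]
   = \frac{1}{N - 1}\left[\frac{1}{2} + \frac{2}{h^4}\right],
   \qquad t = \tfrac{1}{2}h^2 .
\end{aligned}
```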
In general, if t is small and n > 10, we can without great error use
the formula

$$V(\hat{r}_G) = \frac{(1 - r_G^2)^2}{N - 1}\left[1 + \frac{1}{nt}\right]^2.$$


(iii) $r_G = 0$. This would arise in answering the question, “Should
I expect any change in one character on selecting for another?”

$$P = [1 + (n - 1)t]^2 + (1 - t)^2 r_w^2,$$
$$Q = (1 - t)^2(1 + r_w^2).$$

$V(\hat{r}_G)$ will thus increase as $r_w$ increases but even in extreme cases of
low values of nt and $r_w = 1$, the sampling variance can only be doubled.

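(iv) $r_G = 1$. This would arise in answering the question, “Is there
any genotype-character interaction in this case?” All the sampling
variance then comes from the variation within groups.

$$P = 2(1 - t)^2(1 - r_w)^2,$$
$$Q = 2(1 - t)^2(1 - r_w)^2,$$

$$V(\hat{r}_G) = \frac{2(1 - t)^2(1 - r_w)^2}{n^2t^2}\left[\frac{1}{N - 1} + \frac{1}{N(n - 1)}\right] \simeq \frac{2(1 - t)^2(1 - r_w)^2}{(N - 1)n(n - 1)t^2}.$$

But if we are actually dealing with a genotype-environment interaction,
the analysis would be of the second kind and

$$P' = 2(1 - t)^2,$$
$$Q' = 2(1 - t)^2,$$

$$V'(\hat{r}_G) = \frac{2(1 - t)^2}{n^2t^2}\left[\frac{1}{N - 1} + \frac{1}{2N(n - 1)}\right].$$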


Some of the problems arising here are rather complex and they will 
be dealt with later in the discussion. 


The general solution 


So far we have had to deal with the case in which the two characters 
had the same heritability. An attempt to extend our approach to that 
of differing heritabilities would break down because of the correlation of 
mean squares introduced. But it seems possible to make a reasonable 
prediction of the probable magnitude of the variance from the following 
special cases: 




(i) Daughter-dam correlation, r, = 1» 


ve) =" 2) 


due to Reeve. 
Now $V(h_1^2) = V(h_2^2) = 4/(N - 1)$ and therefore


(ii) Analysis of variance and covariance—n very large 


= 


N-1 
2 
= 
2 
= 


V@,) 
2 
(iii) $h_1^2 = h_2^2$, $r_G = r_w$. It was shown earlier that in this case, the
expressions gave


V@,) 


It is quite certain that in the general case, $V(\hat{r}_G)$ will have in its
denominator a term in $h_1^2h_2^2$ and we know that a relationship that would
satisfy this also holds in three quite different special cases. One is
tempted to suggest that the relationship

$$V(\hat{r}_G) = \frac{(1 - r_G^2)^2}{2}\cdot\frac{\sqrt{V(h_1^2)V(h_2^2)}}{h_1^2h_2^2}$$

will not be too bad an approximation, except obviously in the neighborhood
of $r_G = 1$ and when $r_G - r_w$ is large.


The analysis of genotype-environment interactions 


The analysis of genotype-environment interactions, deriving as it
does from analyses of variance, has been in the main confined to conclusions
that interactions have or have not been detected. These are
statistical statements and do not specify whether any statistically
significant interaction has biological importance or whether the design
was such that biologically important interactions would have been 
detected. In the context of one environment, the statements are then 
at the level of the statistical detection of the genetic control of a character 
which is not of great value until it is quantified as a heritability. 

In our framework, in which we are concerned with the use of groups 
of relatives in the analysis of the problem, an interaction (implying 
that real differences between groups are not the same in the two environ- 
ments) may be due to two completely separate biological phenomena. 
In the first place, the between group variance component may be
different and in the second, the true ranking of groups may be different.
Falconer [1952] made a considerable conceptual advance by pointing
out that the second cause of interaction may be considered in terms of
the genetic correlation between performance on the two environments.
Writing $\sigma^2_{w1}$, $\sigma^2_{b1}$ for the within and between group components on
environment 1, with corresponding symbols for environment 2, then, also
using our previous terminology, the expected absolute mean squares in
the detection of interaction are given by: Group by character interaction,

$$\frac{\sigma^2_{w1} + \sigma^2_{w2}}{2} + \frac{n}{2}\left[(\sigma_{b1} - \sigma_{b2})^2 + 2\sigma_{b1}\sigma_{b2}(1 - r_G)\right]$$

and within groups and character,

$$\frac{\sigma^2_{w1} + \sigma^2_{w2}}{2}.$$

The interaction mean square shows clearly how the two effects 
contribute. It appears that the simple detection of interaction by the 
standard method is more sensitive to deviations of $r_G$ from 1 than to
changes in the variance in the two environments. An analysis of the 
standard type would, however, seem inadequate to describe the full 
implication of such an experiment. 
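The two contributions to the interaction mean square can be separated numerically. The sketch below assumes the expected interaction mean square has the form $\tfrac{1}{2}(\sigma^2_{w1} + \sigma^2_{w2}) + \tfrac{n}{2}[(\sigma_{b1} - \sigma_{b2})^2 + 2\sigma_{b1}\sigma_{b2}(1 - r_G)]$, where subscripts w and b denote the within and between group components in each environment; the function and key names are illustrative.

```python
import math


def interaction_ms_components(n, var_b1, var_b2, r_g, var_w1, var_w2):
    """Split the expected group x environment mean square into its three parts."""
    sd_b1, sd_b2 = math.sqrt(var_b1), math.sqrt(var_b2)
    return {
        # average within-group variance (error part)
        "within": 0.5 * (var_w1 + var_w2),
        # part due to unequal between-group variances in the two environments
        "scale": 0.5 * n * (sd_b1 - sd_b2) ** 2,
        # part due to re-ranking of groups (r_G below unity)
        "rank": n * sd_b1 * sd_b2 * (1 - r_g),
    }


if __name__ == "__main__":
    # Between-group variance 50 per cent larger in environment 2, r_G = 0.8.
    parts = interaction_ms_components(20, 0.05, 0.075, 0.8, 0.95, 0.925)
    print(parts)
```

In this illustration a 50 per cent difference in the between-group variance adds far less to the mean square than a fall of $r_G$ to 0.8, which is the sense in which the standard test is more sensitive to re-ranking.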

The statistical limitations of our approach to the standard error of
$r_G$ do in fact restrict us to a model in which interaction of the first type
does not occur. The use of the framework of the genetic correlation
between performances in the two environments allows us to measure
the importance of the interaction. We are evaluating the importance
of the interaction variance by expressing it in terms of the variance due
to the average genetic effect over the two environments. The detection
of interaction now becomes the measurement of the genetic correlation
coefficient. More important, we can now calculate the sampling variance
of the measurement so obtained and be in a position to say whether the
interaction is biologically important, i.e. whether $r_G$ is much different
from unity or whether a biologically important interaction could have
been detected in an experiment of this size.

No interaction means a genetic correlation of unity. How much 
must the correlation fall before it has biological or agricultural import- 
ance? I would suggest that this figure is around 0.8 and that no ex- 
periment on genotype-environment interaction would have been worth 
doing unless it could have detected, as a significant deviation from 
unity, a genetic correlation of 0.6. In the first instance, I propose to 
argue from the standpoint of a standard error of 0.2 as an absolute 
minimum. 

These experiments may be of several kinds. Within populations,
the groups will be of relatives so that some idea of the value of the
intra-class correlation t will probably be available to help in the design
of the experiment. On the other hand, the groups may be of strains
within a breed or even of different breeds. The value of t may not then
be predictable a priori although there will generally be some indication
of the mean performance of the different groups. The wider the material,
the greater the value of t and correspondingly the smaller the variance of
$\hat{r}_G$ in an experiment of a given size because of the appearance of t in the
denominator in the expression for $V(\hat{r}_G)$. We may also have two or more
environments, an example of the latter occurring in dairy cattle where
the progeny of bulls used in artificial insemination are scattered over
many herds. The correlation coefficient $r_G$ may then be looked on as the
mean correlation averaged over all pairs of environments or as an
intra-class correlation, the proportion of the variance between group
means which is common to all environments. In the treatment which
immediately follows, we should be concerned with an analysis within a
population and involving two environments.

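We then have to turn to the analysis in which the two measurements
are made on different animals. We have

$$V'(\hat{r}_G) = \frac{1}{n^2t^2}\left[\frac{n^2t^2(1 - r_G^2)^2 + 2nt(1 - t)(1 - r_G^2) + (1 - t)^2(1 + r_G^2)}{N - 1} + \frac{(1 - t)^2 r_G^2}{N(n - 1)}\right].$$

The extreme inefficiency of using low values of n is immediately
obvious. If t is small and n is large, we may simplify to

$$V'(\hat{r}_G) \simeq \frac{1}{N - 1}\left[\frac{1 + r_G^2}{n^2t^2} + \frac{2(1 - r_G^2)}{nt} + (1 - r_G^2)^2\right].$$

If $r_G = 1$, this would suggest that for a given value of Nn (the total
number of individuals measured) we should make our groups as large
as possible.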

But for values of nt above unity, it is obvious that $V'(\hat{r}_G)$ will increase
as $r_G$ decreases. If we make n too large, we decrease our chance of
finding a significant interaction when none in fact exists at the expense
of increasing the risk of saying that $r_G = 1$ when in fact it is different
from unity by an important amount. It seems then that we should
decide on the optimal value of n by taking an intermediate value of $r_G$,
say at $r_G^2 = 0.5$, for simplicity. The minimum of $V'(\hat{r}_G)$ occurs at a
value given by


$$nt = \frac{\sqrt{1 + r_G^2}}{1 - r_G^2}.$$

If $r_G^2 = 0.5$, this gives a value of 2.4, i.e. more than twice as large
a group in each environment than is the optimum for heritability
estimation. These algebraic manipulations are illustrated in Fig. 1,
which shows $V'(\hat{r}_G)$ for different values of n with t = 0.05 and $r_G^2 = 0.5$
and 1 respectively.

[FIGURE 1. The effect of family size on $V'(\hat{r}_G)$ with t = 0.05 and the number of animals
in each environment (Nn) equal to 2000. Horizontal axis: family size in each environment.]
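The position of the minimum can be checked by direct search. The sketch below assumes the large-n, small-t approximation $V'(\hat{r}_G) \simeq \left[(1 + r_G^2)/n^2t^2 + 2(1 - r_G^2)/nt + (1 - r_G^2)^2\right]/(N - 1)$, a form reconstructed here because it reproduces the quoted optimum of 2.4; $N - 1$ is replaced by $N$ and the total $Nn$ in each environment is held fixed. The function names are illustrative.

```python
import math


def optimal_nt(r_g_sq):
    """Value of n*t minimising the approximate V' for fixed N*n (needs r_g_sq < 1)."""
    return math.sqrt(1 + r_g_sq) / (1 - r_g_sq)


def approx_var(n, t, r_g_sq, total):
    """Approximate V' with N = total / n groups per environment."""
    nt = n * t
    bracket = (1 + r_g_sq) / nt**2 + 2 * (1 - r_g_sq) / nt + (1 - r_g_sq) ** 2
    return bracket * n / total


# Integer family size giving the smallest variance for t = 0.05, r_G^2 = 0.5.
best_n = min(range(2, 201), key=lambda n: approx_var(n, 0.05, 0.5, 2000))

if __name__ == "__main__":
    print(round(optimal_nt(0.5), 2), best_n)
```

The search settles on a family size close to 2.4/t, within one animal of the 48 used in the example in the text.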

To take a specific example, let us take t = 0.05 and wish to have
a standard error of $\hat{r}_G$ of 0.2. At the optimum, the group size in each
environment would be 48 animals. The expected variance of $\hat{r}_G$ is then
roughly 0.88/(N − 1) implying that 23 groups are needed and a total
of 2200 animals. Substituting in the standard formulae, we find for the
standard error of t the value of 0.020, implying for half-sib groups a
standard error for the heritability of 0.080 in each environment. If
on the other hand we wish to get maximum information about the
heritabilities, we should take n = 20, implying, for the same total number
of animals, 55 sires. The standard error of the heritability becomes
0.072 and of the genetic correlation 0.22.
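The figures in this example can be reproduced approximately. The sketch below makes three assumptions: the variance of $\hat{r}_G$ is obtained by the large-sample method from the Table 3 mean squares; the variance of the intra-class correlation is taken as $2(1 - t)^2[1 + (n - 1)t]^2/[(N - 1)n(n - 1)]$; and for half-sib groups $h^2 = 4t$. Function names are illustrative.

```python
import math


def var_rg_two_env(N, n, t, r_g):
    """Approximate V'(r_G-hat) from the Table 3 expected mean squares."""
    nt = n * t
    A = 1 - t + nt * (1 + r_g)
    B = 1 - t + nt * (1 - r_g)
    C = 1 - t
    V_A, V_B = 2 * A**2 / (N - 1), 2 * B**2 / (N - 1)
    V_C = 2 * C**2 / (2 * N * (n - 1))
    return (V_A * (1 - r_g)**2 + V_B * (1 + r_g)**2 + 4 * r_g**2 * V_C) / (4 * nt**2)


def var_t(N, n, t):
    """Usual large-sample variance of the intra-class correlation in one environment."""
    return 2 * (1 - t)**2 * (1 + (n - 1) * t)**2 / ((N - 1) * n * (n - 1))


def design_summary(N, n, t=0.05, r_g_sq=0.5):
    """Return (s.e. of r_G-hat, s.e. of h2) for N half-sib groups of n per environment."""
    se_r = math.sqrt(var_rg_two_env(N, n, t, math.sqrt(r_g_sq)))
    se_h2 = 4 * math.sqrt(var_t(N, n, t))
    return se_r, se_h2


if __name__ == "__main__":
    for N, n in [(23, 48), (55, 20)]:
        se_r, se_h2 = design_summary(N, n)
        print(N, n, round(se_r, 2), round(se_h2, 3))
```

The two designs use nearly the same total number of animals; the first favours the genetic correlation and the second the heritability, in line with the values quoted above.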

In a previous paper [Robertson, 1957] it was shown that for a half-sib
analysis with structure in the neighbourhood of optimum, we had
approximately $V(h^2) = 32h^2/T$ where T is the total number of animals
measured in that environment. A similar manipulation of the formulae,
assuming that we are interested in the neighbourhood of $r_G^2 = 0.5$, gives
$V(\hat{r}_G)$ to be roughly equal to $10/Th^2$, where T is the total number in
each environment.
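These two rules of thumb lend themselves to a quick sizing calculation. The sketch below simply evaluates $V(h^2) \simeq 32h^2/T$ and $V(\hat{r}_G) \simeq 10/Th^2$ for designs near the optimum and inverts the second for a target standard error; the function names are illustrative.

```python
def var_h2_near_optimum(h2, T):
    """Approximate V(h2) for a near-optimal half-sib design with T animals."""
    return 32 * h2 / T


def var_rg_near_optimum(h2, T):
    """Approximate V(r_G-hat) near r_G^2 = 0.5 with T animals per environment."""
    return 10 / (T * h2)


def animals_per_environment(h2, target_se):
    """Animals needed in each environment for a given standard error of r_G-hat."""
    return 10 / (h2 * target_se**2)


if __name__ == "__main__":
    T = animals_per_environment(0.25, 0.2)
    print(round(T), round(var_h2_near_optimum(0.25, T) ** 0.5, 3))
```

For a standard error of 0.2 the requirement is $250/h^2$ animals in each environment, that is $500/h^2$ over the two environments.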


Interaction over several environments 


In the case in which the groups are split over several environments, 
again assumed to have the same intra-class correlation within each, the 
analysis of variance becomes as follows in Table 4: 


TABLE 4
EXPECTED MEAN SQUARES FOR SEVERAL ENVIRONMENTS

                         d.f.              Expected M.S.                  Designation
Environments             p - 1             not relevant
Between groups           N - 1             1 - t + nt[1 + (p - 1)r_G]     A
Group × Environment
  Interaction            (N - 1)(p - 1)    1 - t + nt(1 - r_G)            B
Remainder                Np(n - 1)         1 - t                          C

This gives

$$\hat{r}_G = \frac{A - B}{A - C + (p - 1)(B - C)}$$

and

$$V'(\hat{r}_G) = \frac{1}{n^2t^2p^2}\left[V_A(1 - r_G)^2 + V_B[1 + (p - 1)r_G]^2 + V_C\,p^2r_G^2\right]$$

with the usual expressions for $V_A$, $V_B$, and $V_C$. The first two terms
then simplify further to give

$$\frac{2}{(N - 1)n^2t^2p^2}\left\{[1 - t + nt(1 + (p - 1)r_G)]^2(1 - r_G)^2 + \frac{[1 - t + nt(1 - r_G)]^2[1 + (p - 1)r_G]^2}{p - 1}\right\}.$$

Algebraic examination of the optimum values is difficult but it
appears that for a constant total population (Npn), the most efficient
value of p is in the neighbourhood of 3 to 4 and that of n (the size of the
group in one of the environments) is in the neighbourhood of 1.5/t. This
has been calculated on the assumption that we are mostly concerned
with evaluating $V'(\hat{r}_G)$ when $r_G$ is in the neighbourhood of 0.7.
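Since the optimum is awkward to find algebraically, candidate designs can be compared numerically. The sketch below assumes expected mean squares of $1 - t + nt[1 + (p - 1)r_G]$ between groups, $1 - t + nt(1 - r_G)$ for the group × environment interaction and $1 - t$ for the remainder, and applies the large-sample variance of the ratio $(A - B)/[A - C + (p - 1)(B - C)]$. The function names are illustrative, and the scan makes no claim about where the optimum lies.

```python
def var_rg_multi_env(N, n, p, t, r_g):
    """Approximate V'(r_G-hat) for N groups, p environments, n per group per environment."""
    nt = n * t
    A = 1 - t + nt * (1 + (p - 1) * r_g)       # between groups
    B = 1 - t + nt * (1 - r_g)                 # group x environment interaction
    C = 1 - t                                  # remainder
    V_A = 2 * A**2 / (N - 1)
    V_B = 2 * B**2 / ((N - 1) * (p - 1))
    V_C = 2 * C**2 / (N * p * (n - 1))
    w = 1 + (p - 1) * r_g
    return (V_A * (1 - r_g)**2 + V_B * w**2 + V_C * p**2 * r_g**2) / (nt * p)**2


def scan_designs(total, t, r_g, p_values, n_values):
    """List (variance, p, n, N) for designs with N*p*n not exceeding `total`."""
    rows = []
    for p in p_values:
        for n in n_values:
            N = total // (p * n)
            if N >= 2:
                rows.append((var_rg_multi_env(N, n, p, t, r_g), p, n, N))
    return sorted(rows)


if __name__ == "__main__":
    for row in scan_designs(2000, 0.05, 0.7, [2, 3, 4, 5], [10, 20, 30, 40, 50])[:5]:
        print(row)
```

With p = 2 the function reduces to the two-environment case, which provides a simple consistency check.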

Such calculations as this are frequently carried out within sub-units. 
In dairy cattle, the analysis may be done within breeding units, for 
instance. It must be stressed that in such a case the number of en- 
vironments, p, in the above equations is the number in each sub-unit 
and not the total number. Examination of the equations brings out 
very clearly the great inefficiency of low values of n, the average number 
of progeny per group per environment, a problem frequently met in 
analysis of dairy cattle data from artificial insemination stations. In 
such a situation, any method of classification of the different environ- 
ments in order to increase the effective value of n will greatly help in 
the analysis. 
Discussion 

We should perhaps discuss first of all the shortcomings of this 
approach. In fact, our transformation of the usual variance and co- 
variance analysis into a simultaneous analysis of variance of both 
characters is not exact. To do it at all, we have to assume that the two 
characters have the same intra-class correlation. In the usual method, 
the two intra-class correlations are estimated separately; in our trans- 
formation they are estimated jointly, as the mean of the two for the 
separate characters. This difference is not as large as it might seem 
because in the first method we promptly use the two estimates as their 
geometric mean. The difference between the two procedures then lies
merely in taking the arithmetic mean of the two estimates in the one
case and the geometric mean in the other. Secondly, it must be emphasised
that we are dealing with the variance of a ratio and that therefore
the formula can only be expected to be valid if the samples are large.
It may still be valuable, however, in indicating the order of magnitude
of the variance of $\hat{r}_G$.

As far as the determination of the genetic correlation coefficient is 
concerned, the two main points that emerge from this treatment are 
that the form of the expression for its variance, as a function of group 
number and group size, is very similar to that for the variance of the 
heritability estimates, but that the denominator contains the square of 
the intra-class correlation coefficient. This means that the optimal 
design is the same as in the estimation of the heritabilities, but that the 
sampling variance will be much larger. If the two characters have the 
same heritability and the genetic correlation is not too large and of the 
same magnitude as the phenotype correlation, then for a given sampling 
variance $1/2t^2$ times as many observations are needed for the genetic
correlation coefficient as for the heritability. It must be remembered 
that in general we do not need to know the genetic correlation as ac- 
curately as we do the heritability. But this does underline the difficulty 
of the estimation of the genetic correlation of characters with a low 
heritability. From a personal point of view, it seems that the acceptable 
error in a genetic correlation coefficient is not very dependent on the 
magnitude of the coefficient itself, but in the case of the heritability it is. 
In other words, if a standard error of 0.1 is tolerable for a heritability 
estimate of 0.5, it is certainly not for a heritability of 0.1. Perhaps the 
tolerable limits of heritability estimates are best expressible in terms of a 
coefficient of variation, though the argument breaks down for very low 
heritabilities, which are merely written off as being effectively zero. 
If, as a working hypothesis we take a sampling coefficient of variation
of 20 per cent as tolerable for a heritability, this will give us standard
errors of genetic correlations between two such characters of 0.14, if
$r_G$ is small. Thus, it seems that we should not say that if $h^2$ is low, it is
very difficult to estimate $r_G$, but rather that it is equally difficult to
obtain estimates of tolerable accuracy of both $h^2$ and $r_G$, defining tolerable
in terms of the coefficient of variation of $h^2$ and of the absolute standard
deviation of $r_G$.

The quantitative expression of genotype-environment interactions 
in terms of the genetic correlation between performance in two or more 
environments is of value in giving a measure of the practical, rather 
than the statistical, significance of the results. In this case, we are of 
course mostly concerned with coefficients in the range from 0.5 to 1 and 
the formulae presented then allow for a design of experiment in which
deviations of $r_G$ from unity of practical importance would be detected
statistically. The optimal design of such an experiment with two
environments would apparently entail a larger group size in each environment
than is optimum for a heritability estimate. For a heritability
estimate using sire groups only, the optimal number of progeny per sire
is about $4/h^2$, whereas for the estimation of the genetic correlation
between performance on two environments the optimal total group size
is some four to five times this figure. With this optimum group size, it
seems that, if we are to aim at a standard error of $r_G$ of 0.2 in the range
from 0.5 to 1, the total number of animals required for an adequate
analysis of genotype-environment interaction using half-sib groups is
in the neighbourhood of $500/h^2$, where we assume that the heritability
is the same in the two environments (or in the case of strains with an
intra-strain correlation of t, of 125/t).


Summary 


1. Formulae are derived for the sampling variance of the genetic 
correlation coefficient calculated by the analysis of variance and co- 
variance components. 

2. The formulae are very similar to those for the sampling variance of 
heritability from the same experiment, except that the denominator 
now contains the square of the intra-class correlation coefficient. The 
optimal structure of the experiment is therefore the same in the two 
cases. In any experiment, it appears that the standard error of the 
genetic correlation coefficient when the latter is small will be of the same 
magnitude but somewhat smaller than the sampling coefficient of varia- 
tion of the heritability. 

3. An attempt is made to suggest the form of the more general 
solution, in which the two characters have different heritabilities. 

4. The design of genotype-environment interaction experiments is 
also discussed as cases in which the genetic correlation approaches one. 
The optimal structure then involves groups in each environment more 
than twice as large as that for the determination of the heritability in 
that environment. 

5. The treatment of genotype-environment interaction in terms of 
genetic correlation is extended to the case of more than two environ- 
ments. 


Note added in proof 

A general solution to this problem has now been found by Dr. 
G.N. Talliss using a completely different approach and will be published 
in the Australian Journal of Agricultural Research. It reduces to that
presented here when the two characters have equal heritabilities. The
suggestion in the present paper that a simple general solution might
be found when $r_G = r_w$ is not borne out.


REFERENCES 


Falconer, D. S. [1952]. The problem of environment and selection. Amer. Nat. 86,
293-98.

Kempthorne, O. [1957]. An Introduction to Genetic Statistics. John Wiley and Sons,
New York.

Reeve, E. C. R. [1955]. The variance of the genetic correlation coefficient.
Biometrics 11, 357-74.

Robertson, A. [1957]. Optimum group size in progeny testing and family selection.
Biometrics 13, 442-50.



QUERIES AND NOTES
D. J. Finney, Editor


QUERY 140: Test of Difference Between Treatment and Control
with Multiple Replications of Control and a
Missing Plot


In Query 108 (Biometrics 10: pp. 298-9, [1954]) you gave a formula 
for a missing plot among the three check plots in one block. I have 
similar experiments where the checks are randomly assigned to the 
blocks. In one there is a missing value among the checks; in another it 
is a treatment. I’d like to test the significance of the difference between 
a treatment and the check. The formula I have does not allow for the 
extra checks. How can it be modified? 


ANSWER: It is convenient to use the analysis of covariance technique
as described by Irma Coons in “The analysis of covariance
as a missing plot technique,” (Biometrics 13: pp. 387-405,
[1957]). Let X be the concomitant variable which takes the value
−n in the missing plot and zero elsewhere and let Y be the observed
variable which takes the value zero in the missing plot. Let $E_{xx}$, $E_{xy}$,
and $E_{yy}$ be the sums of squares and products for error in the analysis
of covariance. Then, for the case where the missing observation is in
a control plot it can be shown, using the notation of Query 108, that

$$E_{xx} = \frac{n[cb(t + c - 2) - t + 1]}{c},$$

the missing observation estimate,

$$\frac{nE_{xy}}{E_{xx}} = \frac{(t + c - 1)T + cbB - cG}{cb(t + c - 2) - t + 1},$$

the difference between the adjusted control mean and any adjusted
treatment mean,

$$\text{adj. } \bar{Y}_c - \text{adj. } \bar{Y}_t = (\bar{Y}_c - \bar{Y}_t) + \frac{nE_{xy}}{cbE_{xx}},$$

and

$$V(\text{adj. } \bar{Y}_c - \text{adj. } \bar{Y}_t) = \sigma^2\left[\frac{1}{cb} + \frac{1}{b} + \frac{n^2}{c^2b^2E_{xx}}\right].$$


The coefficient of the missing yield in $\bar{Y}_c - \bar{Y}_t$ is $1/cb$, so the addition
to the variance is $\sigma^2n^2/c^2b^2E_{xx}$ which equals

$$\frac{\sigma^2(t + c - 1)}{cb[cb(t + c - 2) - t + 1]}.$$

Exact tests of significance can be made using

$$s^2 = \frac{1}{\text{d.f.}}\left[E_{yy} - \frac{E_{xy}^2}{E_{xx}}\right]$$

as an estimate of $\sigma^2$ with d.f. equal to the error degrees of freedom with
full data, less one.

It is to be observed that in a randomized block design with a control
treatment replicated c times within each block and with a missing
observation, $E_{xx}$ is not given by n (error degrees of freedom), as is often
the case, but must be calculated directly.
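The arithmetic of such a test is short enough to check by machine. The sketch below uses the figures quoted in this answer for the Query 108 example; the assignment of each number to a symbol (check total 212.4 over cb = 18 plots, seedling A total 77.0 over b = 6 plots, missing-plot estimate 12.337, $E_{yy} = 134.2288$, $E_{xy} = 497.6$, $E_{xx} = 1936$, variance coefficient 0.226, 36 d.f.) is an interpretation of the printed values and should be treated as an assumption.

```python
import math

# Design constants for the example: c check plots per block, b blocks.
c, b = 3, 6

# Quoted figures (their assignment to symbols is an assumption, see above).
check_total = 212.4        # total of the check plots, missing plot entered as zero
seedling_total = 77.0      # total for seedling A
missing_estimate = 12.337  # estimate of the missing check yield
E_yy, E_xy, E_xx = 134.2288, 497.6, 1936.0
variance_coefficient = 0.226   # V(adjusted difference) / sigma^2
error_df = 36                  # error d.f. with full data, less one

# Adjusted difference between the check mean and the seedling A mean.
adjusted_difference = (check_total / (c * b) - seedling_total / b) \
    + missing_estimate / (c * b)

# Error mean square after fitting the covariate.
s2 = (E_yy - E_xy**2 / E_xx) / error_df

# t statistic on `error_df` degrees of freedom.
t_stat = adjusted_difference / math.sqrt(variance_coefficient * s2)

if __name__ == "__main__":
    print(round(adjusted_difference, 3), round(s2, 4), round(t_stat, 2))
```

The three printed quantities correspond to the adjusted difference, the estimate of $\sigma^2$, and the test statistic.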

In the problem given in Query 108 we have for a test of the difference
between the check mean and the seedling A mean (assuming complete
randomization):

$$\text{adj. } \bar{Y}_c - \text{adj. } \bar{Y}_A = \left(\frac{212.4}{18} - \frac{77.0}{6}\right) + \frac{12.337}{(3)(6)} = -0.348,$$

$$V(\text{adj. } \bar{Y}_c - \text{adj. } \bar{Y}_A) = 0.226\sigma^2,$$

$$s^2 = \frac{1}{36}\left[134.2288 - \frac{(497.6)^2}{1936}\right] = 0.1759,$$

and

$$t = \frac{-0.348}{\sqrt{(0.226)(0.1759)}} = -1.75 \text{ with d.f.} = 36.$$

If the missing observation is not in a check plot the results are slightly
different and we have


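$$E_{xx} = n(b - 1)(t + c - 2), \qquad E_{xy} = (t + c - 1)T + bB - G,$$

$$\text{adj. } \bar{Y}_c - \text{adj. } \bar{Y}_t = (\bar{Y}_c - \bar{Y}_t) - \frac{nE_{xy}}{bE_{xx}},$$

and

$$V(\text{adj. } \bar{Y}_c - \text{adj. } \bar{Y}_t) = \sigma^2\left[\frac{1}{cb} + \frac{1}{b} + \frac{n^2}{b^2E_{xx}}\right].$$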


DAVID HOGBEN
Iowa State University
Ames, Iowa, U.S.A.

ACKNOWLEDGMENT 


A recent note in Biometrics by Marvin A. Kastenbaum [1959] 
presented a technique for obtaining confidence intervals on the abscissa 
of the point of intersection of two fitted linear regressions. The author 
of this note wishes to acknowledge the earlier work of R. A. Fisher 
[1948; pp. 142-46] in which the same results were presented using
almost the identical notation. 


Kastenbaum, M. A. [1959]. A confidence interval on the abscissa of 
the point of intersection of two fitted linear regressions. Biometrics 15: 
323-24. 


Fisher, R. A. [1948]. Statistical Methods for Research Workers. 10th ed.
Edinburgh: Oliver and Boyd.




ABSTRACTS 


The following papers were presented at joint meetings of the Eastern 
North American Region of the Biometric Society, The Institute of Math- 
ematical Statistics, and the Physical Science Section of the American 
Statistical Association held at the Graduate School of Public Health, 
University of Pittsburgh, Pittsburgh, Pennsylvania on March 19 to 21, 
1959. 


596. D. G. CHAPMAN (University of Washington, Seattle, Washington
and North Carolina State College, Raleigh, North Carolina).
The Analysis of a Catch Curve.


The age distribution of a random sample from a stable animal 
population provides information on the mortality rates of the population. 
In this paper the maximum likelihood estimates of the mortality rate 
and its asymptotic variance are given. The assumptions underlying 
the more usual regression estimate are studied critically. If the popu- 
lation is not stable and in particular if the recruitment varies over time, 
the estimates of mortality rates are biased. The nature of this bias 
is studied for various possible situations. Finally, it is shown that the 
estimation of the components of mortality due to natural causes and 
exploitation by a regression of catch per unit of effort on effort can 
also be applied to a single catch curve. The procedure is dependent 
on the assumption of constant recruitment but otherwise has several 
advantages over the standard application of the procedure to a series 
of catch per unit of effort data. 


597. GERALD J. COX (University of Pittsburgh, Pittsburgh,
Pennsylvania) and CARROL S. WEIL (Mellon Institute,
Pittsburgh, Pennsylvania). Principles for the Analysis of the
Data of the Lesions of Dental Caries.


Current procedures of recording and analysis of the data of the
lesions of dental caries, and the artifacts related to them, fail in various
ways to be rigidly sound. The deciduous and permanent dentitions
cannot be considered together and the mixed dentition decay data
cannot be analyzed. No consideration is given to the possible varied
dietary conditions of formation of the various components of a complete
dentition. There is only slight recognition of different times and rates
of eruption of the teeth. Third molar teeth are generally ignored.
Enamel, dentin, and cementum are apparently considered as identical
substances. Distinction is not made between lesions in different sites 
in the teeth such as pits, fissures, grooves, minor cusps, smooth surfaces, 
gingival margins, and interproximal areas in possible relation to different 
causative factors. There is no recognition of the possibility that 
initiation of carious lesions and development of cavities are by different 
mechanisms. 

Incidence of carious lesions is related to initiation, which is itself 
governed by built-in resistance and the potency of the initiating factors. 
Development of cavities is related to promotion factors. The first of 
these is measured by number of cavities; the second has to do with 
sizes of the lesions. The method appropriate to incidence is simple 
counting of frequencies in identical areas of teeth in the population: 
but not summation of the totals. Methods for evaluating increase of 
sizes of cavities are best done by consideration only of relative sizes. 
The data from different kinds of teeth, or areas of teeth, can be properly 
combined. 


598. JAMES E. GRIZZLE (University of North Carolina, Chapel
Hill, North Carolina). An Application of the Logistic Model
in Analyzing a Factorial Experiment When the Data Are Proportions
(A preliminary report).


An analysis of a multiple factor experiment when the data in each
cell follow a binomial distribution with parameters $p_{ij}$ and $n_{ij}$ is developed
by assuming the logit is a linearizing transformation, i.e.,
$\log p_{ij}/q_{ij} = \mu + \alpha_i + \beta_j$. To test the null hypothesis $H_0: \alpha_i = \alpha_0$,
where $\alpha_0$ is given, maximum likelihood estimates of the individual cell
probabilities are obtained subject to the restraint $\sum_{ij} c_{ij} \log p_{ij}/q_{ij} = \alpha_0$,
where the $c_{ij}$ are constants determined by the design of the experiment.
Estimation of the $p_{ij}$ subject to this restraint requires the extraction
of the real root smallest in absolute value, $\lambda^*$ say, of the equation
$\prod_{ij} [(a_{ij} - c_{ij}\lambda)/(b_{ij} + c_{ij}\lambda)]^{c_{ij}} = e^{\alpha_0}$, where $a_{ij}$ is the number of times
a result “R” is observed and $b_{ij}$ is the number of times “not R” is
observed in a cell, $a_{ij} + b_{ij} = n_{ij}$, and $\lambda$ is the Lagrangian multiplier.
The Newton-Raphson method provides a simple technique for computing
$\lambda^*$.
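The Newton-Raphson step can be sketched briefly. The code below is an illustration and not the author's program: it assumes the restraint takes the logarithmic form $\sum c_{ij} \log[(a_{ij} - c_{ij}\lambda)/(b_{ij} + c_{ij}\lambda)] = \alpha_0$, starts from $\lambda = 0$ (the unrestrained fit), and halves any step that would make a fitted count non-positive. All names are hypothetical, and cells are passed as flat lists.

```python
import math


def solve_multiplier(a, b, c, alpha0, tol=1e-12, max_iter=100):
    """Newton-Raphson for lam in sum(c * log((a - c*lam) / (b + c*lam))) = alpha0.

    a, b: counts of "R" and "not R" in each cell (assumed positive);
    c: design constants for each cell (assumed not all zero).
    """
    cells = list(zip(a, b, c))

    def feasible(lam):
        return all(ai - ci * lam > 0 and bi + ci * lam > 0 for ai, bi, ci in cells)

    def g(lam):
        return sum(ci * math.log((ai - ci * lam) / (bi + ci * lam))
                   for ai, bi, ci in cells) - alpha0

    def dg(lam):
        # Derivative of g; negative throughout the feasible interval.
        return -sum(ci * ci * (1 / (ai - ci * lam) + 1 / (bi + ci * lam))
                    for ai, bi, ci in cells)

    lam = 0.0
    for _ in range(max_iter):
        step = g(lam) / dg(lam)
        while not feasible(lam - step):   # stay where all fitted counts are positive
            step /= 2
        lam -= step
        if abs(step) < tol:
            return lam
    raise RuntimeError("Newton-Raphson did not converge")


def fitted_probabilities(a, b, c, lam):
    """Restrained estimates p_ij = (a_ij - c_ij * lam) / n_ij."""
    return [(ai - ci * lam) / (ai + bi) for ai, bi, ci in zip(a, b, c)]


if __name__ == "__main__":
    # Two cells, testing equality of the two logits (c = +1, -1; alpha0 = 0).
    lam = solve_multiplier([30, 20], [20, 30], [1, -1], 0.0)
    print(lam, fitted_probabilities([30, 20], [20, 30], [1, -1], lam))
```

Because the function being solved is monotone on the interval where every fitted count is positive, the root found from this start is the only one in that interval.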


The test statistic x? = — + has, 


in the limit, the central $\chi^2$ distribution with one d.f. if $H_0$ is true. Within
this framework, tests are developed which correspond to tests of linear 
hypotheses in the analysis of variance for complete block designs, and 
the non-centrality parameters of the tests are found. 


599. MAX HALPERIN (Knolls Atomic Power Laboratory, Schenectady,
New York). Fitting of Straight Lines and Prediction
When Both Variables Are Subject to Error.


The problem of joint estimation, in the sense of a confidence region,
of the parameters $\alpha$, $\beta$ of a structural relation $\mu = \alpha + \beta\nu$ is considered,
where both $\mu$ and $\nu$ are presumed to be measured with error and the
errors of the pair of measured values have a bivariate normal distribution
with zero means and arbitrary covariance matrix. On the basis
of a sample of $n$ pairs of measurements $(y_1, x_1), \ldots, (y_n, x_n)$, with
(unknown) expected values $(\mu_1, \nu_1), \ldots, (\mu_n, \nu_n)$, new variables
$z_i = y_i - \alpha - \beta x_i$, $i = 1, 2, \ldots, n$, are defined. If $l_1, l_2, \ldots, l_n$ are such
that $\sum l_i = 0$, $\sum l_i^2 = 1$ and the $l_i$ are chosen independently of the
sample results one has immediately that

$$F = \frac{(n - 2)\left[n\bar{z}^2 + \left(\sum l_i z_i\right)^2\right]}{2\left[\sum (z_i - \bar{z})^2 - \left(\sum l_i z_i\right)^2\right]}$$

is distributed in Snedecor’s F distribution with 2 and n − 2 degrees of
freedom. By appropriate choice of the $l_i$, the confidence regions
obtained by Wald and Bartlett arise as special cases with the slight
generalization that the errors in $x_i$ and $y_i$ need not be independent.
The problem of appropriate choice of the $l_i$ is considered using as a
criterion the minimization of the expected value of the “variance” of
the “estimate” implied by the confidence region; this is in the direction
of maximizing the probability that the confidence region will be closed
and therefore useful. The results of this minimization imply that one
should try in advance to make a shrewd guess as to the true values of
the $\nu_i$, say $\nu_i^*$, and define $l_i = (\nu_i^* - \bar{\nu}^*)/\sqrt{\sum (\nu_i^* - \bar{\nu}^*)^2}$; the confidence
region will be a valid one irrespective of the quality of the
guesses. For the case that the $\nu_i$ are known without error, application
of the suggested criterion leads to the usual confidence region based on
least squares estimation of $\alpha$ and $\beta$.

Use of the above results for the estimation of the $\nu$’s corresponding
to several new y’s is considered (as in the case of repeated use of cali- 
bration curves) and some suggestions made concerning the confidence 
interval statements appropriate for the predictions. 
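
As an illustration of the statistic described in this abstract, the following is a minimal sketch, not the author's own computation. It assumes that $z_i = y_i - \alpha - \beta x_i$ and that the F ratio has the form $(n-2)[n\bar{z}^2 + (\sum l_i z_i)^2] / 2[\sum (z_i - \bar{z})^2 - (\sum l_i z_i)^2]$; the function names `weights_from_guesses` and `halperin_f` are hypothetical. A trial pair (alpha, beta) would lie in the confidence region when the returned value does not exceed the tabled F point with 2 and n − 2 degrees of freedom.

```python
import math


def weights_from_guesses(guesses):
    """Build weights l_i from advance guesses of the nu_i.

    The weights are centred and scaled so that sum(l) = 0 and sum(l^2) = 1.
    """
    mean_guess = sum(guesses) / len(guesses)
    deviations = [g - mean_guess for g in guesses]
    norm = math.sqrt(sum(d * d for d in deviations))
    if norm == 0.0:
        raise ValueError("guesses must not all be equal")
    return [d / norm for d in deviations]


def halperin_f(xs, ys, ls, alpha, beta):
    """F ratio (2 and n - 2 d.f.) for a trial pair (alpha, beta).

    Assumed form of the statistic; see the lead-in paragraph.
    """
    n = len(xs)
    if n <= 2 or len(ys) != n or len(ls) != n:
        raise ValueError("need n > 2 and equal-length inputs")
    if abs(sum(ls)) > 1e-9 or abs(sum(l * l for l in ls) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 0 and have unit sum of squares")
    z = [y - alpha - beta * x for x, y in zip(xs, ys)]
    z_bar = sum(z) / n
    contrast = sum(l * zi for l, zi in zip(ls, z))
    numerator = n * z_bar ** 2 + contrast ** 2
    denominator = sum((zi - z_bar) ** 2 for zi in z) - contrast ** 2
    return (n - 2) * numerator / (2.0 * denominator)
```

When the weights are built from the observed x values themselves, the ratio is zero at the least squares estimates, which agrees with the abstract's remark about the case in which the $\nu_i$ are known without error.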




600 MAX HALPERIN and G. L. BURROWS (Knolls Atomic Power
Laboratory, Schenectady, New York). Variables Sampling Plans With
Special Reference to Small Lots.


Most variables sampling plans for per cent defective assume a 
Gaussian distribution for elements from which the lot is chosen, if not 
in fact for members of the lot. The operating characteristics of such 
plans show the power of the procedures, not for submitted lots of given 
quality as in the case of attributes sampling (and as is frequently
suggested by the labelling of O.C. curves for variables sampling) but 
rather for process quality (averaged submitted lot quality). The joint 
distribution of the number, d, of defectives and of the quality measures 
of a sample of n elements from lots of size N containing exactly D 
defectives shows quite clearly that d is sufficient for D independently 
of the assumed process distribution. Only if one is interested in process 
(average) protection rather than in lot-by-lot protection, is the variables 
information contained in the sample greater than that contained in d. 

Although d is sufficient for D, accepting the lot as a finite sample
from a continuous distribution of quality measures, one can compute
the O.C. of the usual variables plans based on the conditional distribution
of quality measures for fixed lot proportion defective. Such
an O.C. curve is unfortunately dependent on the parameters of the 
distribution of quality measures as well as the lot proportion defective. 
For large lots the O.C. curve should converge to that of the usual 
variables plan. Possible alternatives to the usual variables sampling 
plans are considered which take into account the size of the lot and 
possible joint use of attributes and variables information. 

Computations to investigate the behavior (in an average sense) of 
the joint variables attributes plan considered or to investigate the 
conditional O.C. of the usual variables plan involve complex compu- 
tations of the type required for mixed variables-attributes plans. If, 
in addition to taking into account lot size, only variables information
is used, the necessary computations to characterize the plan considered
involve only the non-central $t$ distribution.
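
For comparison with the lot-by-lot protection discussed above, the following sketch computes the operating characteristic of a plain attributes plan for a finite lot. It is an illustration only: the acceptance number `c` and the function names are assumptions, and the plans actually studied by the authors are not reproduced here. The number of defectives d in a sample of n from a lot of size N containing exactly D defectives follows the hypergeometric distribution.

```python
from math import comb


def hypergeom_pmf(d, lot_size, lot_defectives, sample_size):
    """P(d defectives in the sample) for a lot with exactly D defectives."""
    good_in_lot = lot_size - lot_defectives
    good_in_sample = sample_size - d
    if d < 0 or d > lot_defectives:
        return 0.0
    if good_in_sample < 0 or good_in_sample > good_in_lot:
        return 0.0
    return (comb(lot_defectives, d) * comb(good_in_lot, good_in_sample)
            / comb(lot_size, sample_size))


def attributes_oc(lot_size, lot_defectives, sample_size, c):
    """Probability of accepting the lot when acceptance requires d <= c."""
    return sum(hypergeom_pmf(d, lot_size, lot_defectives, sample_size)
               for d in range(c + 1))
```

Evaluating `attributes_oc` over D = 0, 1, ..., N traces an O.C. curve indexed by lot quality rather than by process quality.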


601 EUGENE K. HARRIS (Robert A. Taft Sanitary Engineering Center,
Public Health Service, Cincinnati, Ohio). Analysis of Experiments
Measuring Threshold Taste.


In these experiments, panel members are asked to distinguish by 
taste between concentrations of a test substance and accompanying 
blanks at each point in a series of geometrically increasing concentrations.
Two methods of statistical analysis are derived for estimating
the distribution of individual thresholds, taking into account the chance
nature of discriminations at concentrations below the true threshold.

First, a simple nonparametric method provides direct, unsmoothed
estimates of the desired cumulative distribution function (c.d.f.) at
most or all of the concentration points used in the test series. Second, a
compound binomial-gamma model is argued to explain variation in
the total number of correct identifications (both chance and sure)
made by each subject. This parametric procedure generally seems to 
fit the observations and will yield smoothed estimates of the c.d.f. at 
all concentration points. 

These two methods are applied to a pair of taste tests conducted at 
the Sanitary Engineering Center, Public Health Service. Results are 
compared and confidence limits appropriate to each method are dis- 
cussed. 


602 J. EDWARD JACKSON (Virginia Polytechnic Institute, Blacksburg,
Virginia). Multivariate Sequential Procedures for Testing Means
(Preliminary Report).


Let $x = [\mu_1 - \mu_{10}, \mu_2 - \mu_{20}, \ldots, \mu_p - \mu_{p0}]$, where $\mu_i$ is the
true mean of the ith variable in a p-variable situation and $\mu_{i0}$ is the
hypothetical or standard value for the ith variable. Sequential tests
are proposed to test the hypothesis $H_0: x\Sigma^{-1}x' = 0$ against the alternative
hypothesis $H_1: x\Sigma^{-1}x' = \lambda^2$, both for the case where the population
covariance matrix $\Sigma$ is known and the case where it must be
estimated from the sample. The standard type of sequential test
procedure is to continue sampling when $\ln[\beta/(1 - \alpha)] < g_n < \ln[(1 - \beta)/\alpha]$,
accept $H_0$ if $g_n \le \ln[\beta/(1 - \alpha)]$, and accept $H_1$ if $g_n \ge \ln[(1 - \beta)/\alpha]$.

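If $\bar{x} = [\bar{x}_1 - \mu_{10}, \bar{x}_2 - \mu_{20}, \ldots, \bar{x}_p - \mu_{p0}]$, $\chi_n^2 = n\bar{x}\Sigma^{-1}\bar{x}'$, and
$T_n^2 = n\bar{x}S^{-1}\bar{x}'$, then for the case where $\Sigma$ is known

$$g_n = -n\lambda^2/2 - \sqrt{n\lambda^2\chi_n^2} + \ln {}_1F_1[(p - 1)/2,\ p - 1;\ 2\sqrt{n\lambda^2\chi_n^2}];$$

and when $\Sigma$ is not known,

$$g_n = -n\lambda^2/2 + \ln {}_1F_1[n/2,\ p/2;\ n\lambda^2 T_n^2/2(n - 1 + T_n^2)],$$

where ${}_1F_1[a, b; x]$ denotes a confluent hypergeometric function.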
Similar multivariate sequential tests are also derived for the problem 
of comparing the means of two samples. 
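
The stopping rule quoted in this abstract is the usual Wald form, and can be sketched as below. This is an illustration only: the function names are hypothetical, and the criterion value `g_n` is taken as an input, since its computation depends on which of the two cases (covariance matrix known or estimated) applies.

```python
import math


def sprt_bounds(alpha, beta):
    """Lower and upper boundaries ln[beta/(1-alpha)] and ln[(1-beta)/alpha]."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def sprt_decision(g_n, alpha, beta):
    """Return the action indicated by the current criterion value g_n."""
    lower, upper = sprt_bounds(alpha, beta)
    if g_n <= lower:
        return "accept H0"
    if g_n >= upper:
        return "accept H1"
    return "continue sampling"
```

With alpha = beta = 0.05 the two boundaries are symmetric about zero, at roughly ±2.94.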


603 A. W. KIMBALL and E. LEACH (Oak Ridge National Laboratory,
Oak Ridge, Tennessee). Approximate Linearization of the Incomplete
Beta Function.


In radiation mortality studies, an approach known popularly as
“target theory” predicts that, for some organisms, the probability of
death (M) as a function of the dose of radiation (D) is expressible as
an Incomplete Beta Function. From observations on M and D, esti- 
mates of the regression parameters may be obtained by well known 
methods for handling non-linear problems. However, the use of the 
Incomplete Beta Function as the expected value of M involving three
parameters presents formidable difficulties in computation. To avoid 
these difficulties, an approximate linear transformation was found which, 
for regression purposes, is satisfactory over wide ranges of all arguments. 
In this paper the approximation is derived and its accuracy compared 
with two other useful approximations. 


604 DONALD F. MORRISON and H. A. DAVID (National Institute of
Mental Health, Bethesda, Maryland and Virginia Polytechnic Institute,
Blacksburg, Virginia). Life Distribution of a System With Spare
Components.


The system studied consists of n + k like elements, n of which are
needed for operation while k are spares. Upon failure of any element
the system ceases to function and the element is replaced by a spare.
The distribution of the total life L(n, k) of the system is investigated
when the lives of the elements have independent common probability 
density function f(z), and formal expressions are obtained for general 
n and small k. Both densities and expectations of L(n, k) are related 
to those of certain order statistics, and are evaluated for some simple 
exponential and gamma life distributions. The joint density of total
life and the number of failures on some time interval (0, T) is employed
as a means of studying cost functions arising from the maintenance
scheme under consideration. 
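
The maintenance scheme can be explored numerically with a small simulation. The sketch below is an illustration under stated assumptions, not the authors' analytical method: total life is taken to be the accumulated operating time up to the first failure that finds no spare left, elements age only while the system operates, and spares are new when installed. The function name and the `draw_life` callback are hypothetical.

```python
import random


def simulate_total_life(n, k, draw_life, rng):
    """Simulate one realization of the total life L(n, k).

    draw_life(rng) returns the life of a fresh element.
    """
    remaining = [draw_life(rng) for _ in range(n)]  # lives left in the n operating slots
    spares_left = k
    total = 0.0
    while True:
        # The next failure is the slot with the least remaining life.
        slot = min(range(n), key=remaining.__getitem__)
        elapsed = remaining[slot]
        total += elapsed
        remaining = [r - elapsed for r in remaining]
        if spares_left == 0:
            return total  # no spare available: the system's life ends here
        spares_left -= 1
        remaining[slot] = draw_life(rng)  # install a fresh spare
```

For exponential lives with unit rate, failures of the running system occur at rate n, so the mean of the simulated total life should be close to (k + 1)/n.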


605 B. LESLIE PARNELL and MARY E. MOSELEY (Armed Forces
Institute of Pathology, Washington, D. C.). Computing on the IBM-402
Tabulator With a Semi-Binary Notation.


This paper points out the need for greater utilization of tabulating 
equipment where budget limitations preclude the acquisition of elec- 
tronic data processing systems. A seldom used mathematical artifact 
is dusted off and presented with a procedure for effecting both multipli- 
cation and division on an IBM-402 tabulating machine. The procedure 
is described and a 3-variable correlation problem used to illustrate its 
application. The plugboard diagram shows wiring that will effect the 
computation of the individual sums, squares, and cross-products, the 
contribution that each variant makes to the arithmetic mean of the 
variable of which it is a part, and all the summations required to complete
the correlation matrix. The method is applicable to computational
problems other than correlation analysis.


606 CHARLES P. QUESENBERRY and DAVID C. HURST (Virginia
Polytechnic Institute, Blacksburg, Virginia). Asymptotic Simultaneous
Confidence Intervals for the Probabilities of a Multinomial
Distribution.


A method is given for finding a set of asymptotic simultaneous 
confidence intervals for the probabilities of a multinomial distribution. 
This set of intervals is obtained by assuming that the sample size is 
sufficiently large for a chi-square variate to give a good approximation 
to the goodness-of-fit statistic. The set of intervals is then obtained
by inverting this relationship to obtain intervals for the individual 
parameters. 
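
One way to carry out the inversion described above is sketched here. The closed form is an assumption about the method, not a quotation from the abstract: for each cell, the inequality $N(n_i/N - p_i)^2 \le A\,p_i(1 - p_i)$ is solved for $p_i$, where $A$ is the upper percentage point of chi-square with k − 1 degrees of freedom. The critical value is passed in by the caller (for example 5.991 for 2 d.f. at the 5 per cent level); the function name is hypothetical.

```python
import math


def simultaneous_intervals(counts, chi2_crit):
    """Return (lower, upper) limits for each multinomial probability."""
    total = sum(counts)
    if total <= 0 or chi2_crit <= 0:
        raise ValueError("need a positive sample size and critical value")
    denominator = 2.0 * (total + chi2_crit)
    limits = []
    for n_i in counts:
        # Discriminant of the quadratic in p_i obtained from the inequality.
        root = math.sqrt(chi2_crit * (chi2_crit + 4.0 * n_i * (total - n_i) / total))
        centre = chi2_crit + 2.0 * n_i
        limits.append(((centre - root) / denominator, (centre + root) / denominator))
    return limits
```

Each interval contains the observed proportion and stays within the unit interval, which a simple normal-theory interval does not guarantee for small cell counts.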


607 T. H. STARKS and H. A. DAVID (Virginia Polytechnic Institute,
Blacksburg, Virginia). Significance Test in Paired Comparisons.


Experiments involving paired comparisons are considered under a 
model that is more general than the Bradley-Terry and Thurstone- 
Mosteller models. Methods are presented for testing the null hypothesis 
that all treatment ratings are equal against the following alternative 
hypotheses: (1) a specified treatment is better than average, (2) two 
specified treatments have different ratings, (3) the treatment receiving 
the highest score is better than average. 

Two multiple comparison methods are also given. One is a multiple 
range test on the treatment scores that is analogous to Tukey’s test
based on allowances. The other is an analog for paired comparisons 
of Scheffé’s method for judging contrasts in the analysis of variance. 

For all except small experiments with tabled distributions, the test 
of the null hypothesis against alternative hypothesis (3), the multiple 
range test, and the method of judging contrasts are approximate tests. 
The first is based on the Bonferroni inequalities, and the latter two are 
based on the asymptotic distributions of their test statistics. For all
three cases, the approximations (when needed) are shown to be suf- 
ficiently accurate for most testing purposes. 


608 ROBERT G. D. STEEL (Mathematics Research Center, U. S. Army,
Madison, Wisconsin). A Multiple Comparison Rank Sum Test:
Treatments versus Control.


A fixed value rank sum test with an experimentwise error rate, for
comparing several treatments with a control in a one-way classification
with equal numbers of observations, is developed. Exact tables of the
test criterion have been prepared for all combinations of k = 2 and 3,
and n = 3, 4, and 5, where k is the number of treatments, excluding 
control, and n is the number of observations per treatment. An approxi- 
mate distribution of the test criterion is also given. 
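
The abstract does not spell out the test criterion. The sketch below assumes one natural form: each treatment is ranked jointly with the control, the treatment's rank sum and its conjugate n(2n + 1) minus that sum are computed, and the smallest of these values over all treatments is referred to the tables. Midranks are used for ties, which is a further assumption; the function names are hypothetical.

```python
def rank_sum_vs_control(treatment, control):
    """Rank sum of the treatment in the joint ranking with the control."""
    pooled = sorted(list(treatment) + list(control))

    def midrank(value):
        first = pooled.index(value) + 1          # 1-based rank of first occurrence
        last = first + pooled.count(value) - 1   # rank of last tied occurrence
        return (first + last) / 2.0

    return sum(midrank(value) for value in treatment)


def min_rank_sum(treatments, control):
    """Smallest two-sided rank sum over treatments, plus per-treatment values."""
    n = len(control)
    per_treatment = []
    for observations in treatments:
        if len(observations) != n:
            raise ValueError("equal numbers of observations are assumed")
        rank_sum = rank_sum_vs_control(observations, control)
        conjugate = n * (2 * n + 1) - rank_sum
        per_treatment.append(min(rank_sum, conjugate))
    return min(per_treatment), per_treatment
```

A small value of the criterion points to at least one treatment whose observations lie mostly above or mostly below those of the control.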


609 ROBERT G. D. STEEL (Mathematics Research Center, U. S. Army,
Madison, Wisconsin). A Multiple Comparison Sign Test: Treatments
versus Control.


A fixed value sign test with an experimentwise error rate, for com- 
paring several treatments with a control in a randomized complete 
block design, is developed. Exact tables of the distribution of the test
criterion have been prepared for k = 2, n = 4(1)10 and for k = 3, 
n = 4(1)7 where k is the number of treatments, excluding control, and 
n is the number of replicates. An approximate distribution of the test 
criterion is also given. 
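
As with the rank sum test, the criterion is not stated in the abstract. The sketch below assumes that, within each block, the sign of the difference between a treatment and the control is recorded, that the count of the less frequent sign is found for each treatment, and that the smallest such count is the criterion. Tied pairs are ignored here, which is an assumption; the function names are hypothetical.

```python
def sign_counts(treatment, control):
    """Numbers of blocks in which the treatment exceeds / falls below control."""
    plus = sum(1 for t, c in zip(treatment, control) if t > c)
    minus = sum(1 for t, c in zip(treatment, control) if t < c)
    return plus, minus


def min_sign_count(treatments, control):
    """Smallest count of the less frequent sign, plus per-treatment counts."""
    per_treatment = []
    for observations in treatments:
        if len(observations) != len(control):
            raise ValueError("one observation per block is assumed")
        per_treatment.append(min(sign_counts(observations, control)))
    return min(per_treatment), per_treatment
```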


610 D. J. THOMPSON and D. KODLIN (University of Pittsburgh,
Pittsburgh, Pennsylvania). On Follow-up for Survival in the Presence
of Movement.


A common disturbance in follow-up surveys is the loss due to move- 
ment, a loss which makes it impossible to estimate in a straightforward 
fashion the frequency of events of interest irrespective of movement.

The error committed by deriving such an estimate from the non-movers
will depend on the movement rate, the frequency of the event irrespective
of movement, and the difference between the frequency amongst
movers and the one amongst non-movers. For values of these quantities
likely to occur in follow-up of human populations for mortality, the 
error has been tabulated and it appears that for certain combinations 
of such values the error will not be larger than 3 per cent although the 
overall movement rate may be as high as 23 per cent. One should 
keep in mind, however, that the “frequency irrespective of movement” 
is not necessarily the quantity of interest for the epidemiologist who 
wishes to measure the effect of an environmental “treatment” occurring
prior to movement. 
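
The dependence of the error on the three quantities named above can be made concrete with a short calculation. This is a sketch under an assumption: the abstract does not say whether the tabulated error is absolute or relative, and the relative form is used here. The overall frequency is the mixture of the frequencies amongst movers and non-movers, weighted by the movement rate; the function names are hypothetical.

```python
def overall_frequency(movement_rate, freq_movers, freq_nonmovers):
    """Frequency of the event irrespective of movement."""
    return movement_rate * freq_movers + (1.0 - movement_rate) * freq_nonmovers


def relative_error_from_nonmovers(movement_rate, freq_movers, freq_nonmovers):
    """Relative error of using the non-mover frequency for the whole group."""
    overall = overall_frequency(movement_rate, freq_movers, freq_nonmovers)
    if overall == 0.0:
        raise ValueError("overall frequency is zero; relative error undefined")
    return (freq_nonmovers - overall) / overall
```

Algebraically the relative error equals the movement rate times the difference between the two frequencies, divided by the overall frequency, so a high movement rate does little harm when movers and non-movers differ only slightly.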


611 G. STANLEY WOODSON (Commission on Professional and Hospital
Activities, Ann Arbor, Michigan). Application of a Factor Analytic
Model in the Prediction of Clinical Medical Data.


Clinical diagnoses by a physician are generally based on observation
and classification of the various phenomena presented by the patient.
These may be of the nature of certain signs or symptoms or they may
be supplemented or replaced by quantitative laboratory determinations. 

In either event the physician is generally not dealing directly with 
the clinical entity which troubles the patient. That is, the physician 
must draw inferences on the basis of observable events and upon the 
measurement of certain variables. 

In appendicitis, for example, the physician cannot actually see
whether inflammation is present in the patient’s appendix without performing
surgery. Instead he notes the type of complaints the patient
has, checks for febrility, notes the counts on white cells, neutrophiles,
eosinophiles, etc. Actually he considers a wide variety of variables
that might help him in establishing the diagnosis of acute appendicitis
before operating, either directly or by excluding other possibilities.

The question naturally arises as to the least number of variables 
that might be used to determine the presence of pathological 
appendicitis. 

Two physicians audited 2094 primary appendectomies in 16 hospitals. 
The results indicate that about fifty per cent of the males and thirty 
per cent of the females were found, on examination of the removed 
tissues, to have had inflammation present or a concentration of pus cells 
in the lumen. Their data are examined and a model, based upon the
conceptual approach of factor analysis, is presented and discussed which
allows the prediction of the proportion of patients in a particular hospital
that can be expected to show pathological tissue.

The preliminary results of this particular application of the model 
are presented and other applications of the model to specific biological 
areas are noted. 




THE BIOMETRIC SOCIETY 


International 


The Society is sponsoring a Symposium on Quantitative Methods 
in Pharmacology, to be held at the University of Leyden, Netherlands, 
during May 10-13, 1960. The programme is expected to include papers 
on sequential analysis, non-parametric techniques, mixtures of drugs, 
chronic toxicity, and drug standardization. The chairman of the 
organizing committee is Dr. D. K. de Jongh, Pharmaco-Therapeutisch
Institute NEDCHEM, van Limburg Stirumstraat 39-43, Amsterdam W. 

The International Statistical Institute is holding its next session in
Tokyo from May 31 to June 9, 1960. Two meetings, on statistical
methodology for medical research, and on statistical methodology for
biologists, are being jointly sponsored by the Biometric Society. The 
organizers of these meetings are A. Bradford Hill and L. L. Cavalli- 
Sforza. Other topics of biometric interest include population dynamics 
(E. A. Cornish), planning of experiments (W. J. Youden), multivariate 
analysis (M. S. Bartlett), censuses and sample surveys (A. R. Eckler).
Further information may be had from the Permanent Office of the
I.S.I., 2 Oostduinlaan, The Hague, Netherlands.


Belgian Region 

At the meeting of the Société Adolphe Quetelet on March 26, 1959, 
the following were elected to serve on Council for the term 1959-61: 
Mr. Henry, President; Messrs. De Nayer, Lecrenier, Welsch, Reuse, 
and Van den Hende, Vice-Presidents; Mr. L. Martin, Secretary; Miss 
A. Lenger, Assistant to the Secretary; Mr. A. Rotti, Treasurer; Messrs. 
Bontemps, Roggen, Lion, Members. 


Brazilian Region 


The Brazilian Region of the Biometric Society held its fourth meeting
on July 12, 1957 at the National Museum, Rio de Janeiro, jointly with
the Brazilian Association for the Advancement of Science. Papers
given included J. M. P. Memoria and M. Barbosa, “Statistical analysis
of the blood groups in Belo Horizonte, Brazil;” C. G. Fraga, Jr., “Statistical
analysis of 2 X 2 experiments;” C. G. Fraga, Jr. and A.
Conagin, “Adequate error terms for testing hypotheses in field experiments.”

The fifth meeting of the Region took place on March 24, 1958 at the
Instituto Biológico, São Paulo. The following papers were presented
and discussed: M. Rocha e Silva, “Discrimination of pharmacologically
active substances from ‘yes-or-no’ responses in parallel and simultaneous
assays;” Elza S. Berquó, “A note on the expression of the Wald sequential
probability ratio test for the mean of a normal distribution
with unknown variance (sequential t-test);” F. Pimentel Gomes, “Analysis
of a systematic factorial experiment;” R. A. da Silva Leme, “An
application of discriminant analysis;” R. M. Santos and A. Conagin,
“Preliminary studies to determine a point scale for the quality evaluation
of coffee beverage;” W. S. P. Leser, “A note on the validity of
intelligence tests in the selection of candidates to medical schools.”

The latter scientific session was preceded by the 1958 business 
meeting at which the ballots for regional officers for the following year 
were counted. The following officers were elected: President,
F. Pimentel Gomes; Secretary, P. Mello Freire; Treasurer, A. Groszmann.
Regional Committee members elected in 1958 for a three-year
term are C. G. Fraga, Jr. and J. M. P. Memoria; there are four other 
Committee members with unexpired terms: F. G. Brieger, L. Freitas 
Bueno, A. M. Penha, and A. Conagin. 

Plans are at present underway for the 1959 meeting of the Region. 


Région Française


Two papers were presented at a meeting held on June 17, 1959:
L. Richard: “Interdépendance des éléments de la nutrition minérale des
végétaux,”
S. Ledermann: “Application de l’analyse factorielle à l’étude de la
mortalité par âge.”


CHANGES IN MEMBERSHIP
(May 1 to July 15, 1959)
Changes of Address 
Mr. Christopher R. Baines, Department of Statistics, University of 
Aberdeen, Meston Walk, Aberdeen, Scotland. 
Mr. Robert Bechhofer, Sibley School of Mechanical Engineering,
Cornell University, Ithaca, New York, U.S.A. 


Dr. J. D. Biggers, 36th Street at Spruce, Philadelphia 4, Pennsylvania, 
U.S.A. 




Dr. Allan Birnbaum, Institute of Mathematical Sciences, New York 
University, New York 3, N. Y., U.S.A. 

Mr. Philippe F. Bourdeau, B. P. 173, Astrida, Ruanda, Urundi, 
Belgium. 

Dr. Ralph A. Bradley, Department of Statistics, Florida State Uni- 
versity, Tallahassee, Florida, U.S.A. 

Dr. Irwin D. J. Bross, Roswell Park Memorial Institute, 666 Elm 
Street, Buffalo 3, New York, U.S.A. 

Dr. A. Brown, Department of Mathematics, University of Melbourne, 
Carlton, N. 3 Victoria, Australia. 

Mr. L. C. Chapas, 49 Church Lane, Harpurhey, Manchester 9, England. 

Miss Irma Coons, Statistics Section, General Foods Research Center, 
Tarrytown, New York, U.S.A. 

Mr. Tiberius Cunia, 1305 Chemin Bois Franc, Ville St. Laurent, 
Montreal, Quebec, Canada. 

Prof. John S. de Cani, Norwegian School of Economics and Business 
Administration, Bergen, Norway. 

Dr. James W. Degan, The Mitre Corporation, Box 208, Lexington 73, 
Massachusetts, U.S.A. 

Dr. Barend de Loor, 109 Lynnwood Road, Pretoria, S. Africa. 

Mr. William J. Dewey, 308-A Eagle Heights Apts., Madison, Wisconsin, 
U.S.A.

Miss Martha W. Dicks, Poultry Department, University of Connecticut, 
Storrs, Connecticut, U.S.A. 

Prof. Daniel Dugué, 24 rue Jean-Louis Sinet, Sceaux (Seine), France.

Mrs. A. D. Durbin, c/o Mrs. Scott, Greenyfields Cottage, Rowton, 
Chester, England. 

Dr. A. R. G. Emslie, Animal Research Institute, Research Branch, 
Department of Agriculture, Ottawa, Ontario, Canada. 

Mr. William A. Ericson, 1039 Massachusetts Avenue, Apt. 10B, Cam- 
bridge 38, Massachusetts, U.S.A. 

Mrs. Polly Feigl, 3717 Chestnut Street, Apt. 207, Philadelphia 4, 
Pennsylvania, U.S.A. 

Mr. N. Ferguson, 16 Borden Lane, Sittingbourne, Kent, England. 

Prof. John Henry Gaddum, A. R. C. Inst. of Human Physiology, 
Babraham, England. 

Dr. R. Gnanadesikan, 6731 Highland Avenue, Cincinnati 36, Ohio, 
U.S.A.

Mr. Phillip P. Gray, Wallerstein Laboratories, Mariners Harbor, 
Staten Island 3, New York, U.S.A. 

Mr. Edwin F. Grey, 3206 A Wakefield Road, Harrisburg, Pennsylvania, 
U.S.A.




Mr. Floribert A. L. G. Jurion, 1, rue Defacqz, Bruxelles 5, Belgium. 

Miss Glen Rae Hanemann, University of Michigan, 1200 East Ann 
Street, Ann Arbor, Michigan, U.S.A. 

Mr. David Hogben, Western Electric Company, 100 Central Avenue, 
Kearny, New Jersey, U.S.A. 

Dr. Rex L. Hurst, Statistical Laboratory, Iowa State College, Ames,
Iowa, U.S.A. 

Mr. Alois Kaelin, Bolleystr. 22, Zurich 6, Switzerland. 

Mr. Kenichi Kayanagi, The Union Scientists and Engineers, Osaka 
Syosen Building, Kyobashi, Chyuo-ku, Tokyo, Japan. 

Dr. K. Kishen, “Hari Bhawan,” 7, Pani Laxmi Bai Marg, Lucknow, 
India. 

Prof. Yusaku Komatsu, 845, Totsuka-machi 4, Shinjyuku-ku, Tokyo, 
Japan. 

Agr. Lic. Mats Lagervall, Institutionen for Husdjursgenetik, Kungl. 
Lantbrukshogskolan, Uppsala 7, Sweden. 

Mr. Richard A. Lamm, 35 Van Zandt Drive, Pearl River, New York, 
U.S.A. 

Prof. Henry Marchand, Boite Postale 60-49, Dakar (A.E.I.), Senegal, 
West Africa. 

Mr. Dale I. Matzinger, Department of Genetics, North Carolina 
State College, Raleigh, North Carolina, U.S.A. 

Miss Ethelyne L. McBee, Westlake School for Girls, 700 N. Faring 
Road, Los Angeles 24, California, U.S.A. 

Mr. Louis Munan, 1501 New Hampshire Avenue, N.W., Washington 6, 
D. C., U.S.A. 

Mr. G. T. Park, 8 Park Avenue, North Shields, England. 

Dr. Raleigh E. Patterson, Vice Pres. for Agriculture, Texas A and M 
College System, College Station, Texas, U.S.A. 

Prof. W. L. M. Perry, Pharmacological Laboratory, University New 
Buildings, Teviot Place, Edinburgh 8, Scotland.

Dr. Gisela Reissig, Friesickestr. 30, Berlin-Weissensee, Germany. 

Prof. Dr. Heinz Rethmann, Altonaer Str. 173, Neumunster, Germany. 

Mr. Julien Ff. M. Ronehaine, B. P. 208, Butembo, Kivu, Belgian Congo. 

Dr. Chr. L. Rumke, Fazanstraat 34, Badhoevedorp, Netherlands. 

Prof. Kenziro Saio, Faculty of Agriculture, Tokyo University, Bunkyo-ku,
Tokyo, Japan.

Prof. Henry 5. Scheffé, Statistical Laboratory, University of California, 
Berkeley 4, California, U.S.A. 

Dr. Morton D. Schweitzer, Sibley School of Mechanical Engineering, 
Cornell University, Ithaca, New York, U.S.A. 

Mr. Harry H. Shorey, Department of Entomology, University of California,
Riverside, California, U.S.A.




Dr. Walter G. Smith, Strathern, Laburnum Grove, Cleadon, Nr.
Sunderland, Co. Durham, England.

Mr. Charles R. Sormani, 104 Stratford Road, West Hempstead, New
York, U.S.A. 

Dr. Robert G. D. Steel, 337 Warren Hall, Cornell University, Ithaca, 
New York, U.S.A. 

Miss Enes Barbara Taucci, 659 louse Avenue, Akron 10, Ohio, U.S.A. 

Dr. Alan E. Treloar, 11801 Gainsboro Road, Rockville, Maryland, 
U.S.A. 

Mr. Thomas N. Vinson, General Delivery, Wildwood, New Jersey, 
U.S.A. 

Dr. G. S. Watson, Department of Mathematics, University of Toronto,
Toronto 5, Ontario, Canada. 


New Members 
At Large 


Dr. M. S. Czarnowski, ul. Przegorzaly 74, Krakow 24, Poland.

Prof. Dr. Jan Czekanowski, Ulica Kanclerska 14, Poznan, Poland. 

Dr. Benedykt Halicz, al. Mickiewicza 10/14, Lodz, Poland.

Mr. Jaroslav Janko, Ke Karlovu, Praha 2, Czechoslovakia.

Mr. M. K. H. Khan, 42, Rabbani Road, Old Anarkali, Lahore, Pakistan. 

Dr. Halina Milicer, Wyzwolenia 10/187, Warsaw, Poland. 

Dr. Josef Lukaszewicz, ul. Katowicka 29-2, Wroclaw 5, Poland.

Dr. Adam Wanke, Prof. of Anthropology, ul. Kuznieza 35, Wroclaw, 
Poland. 


Belgian Region 


Mr. Victor Adehm, 43 Route D’Echternach Moersdorf, Grand-duche de 
Luxembourg, Belgium. 

Mr. A. Burny, 147 rue de Strichon, Mellery, Belgium. 

Mr. Compere, 15 rue Pierguin, Gembloux, Belgium. 

Mr. J. Francois, Av. des Aubepines, 2, Kruainem, Belgium. 

Mr. Pierre Gilbert, 20 Boulevard General Jacques, Brussels 5, Belgium. 

Mr. Carlo Louis Goormans, c/o INEAC, Dsp. Bukavu, Mulungu,
Belgian Congo. 

Mr. J. A. G. Hemptinne, Huileries du Congo Belge, Yaligimba, Belgian 
Congo. 

Mr. C. Muhle Larsen, Institut de Popliculture, Grammont, Belgium.


British Region 

Mr. B. O. Bartlett, 4, Polstead Road, Oxford, England.

Dr. E. W. Bentley, Ministry of Agriculture, Hook Rise, Tolworth,
Surbiton, Surrey, England.




Miss S. Brown, 7, Neville Avenue, New Malden, Surrey, England.

Dr. W. R. Buckland, London Transport Executive, 55 Broadway, 
Westminster, SW 1, England. 

Miss M. E. Davis, Rothamsted Experimental Station, Harpenden,
Hertfordshire, England. 

Dr. N. R. Draper, 3, Malvern Terrace, Winchester Road, Southampton,
England. 

Mr. R. F. George, National Coal Board, Hobart House, Grosvenor 
Road, London SW 1, England. 

Mr. D. A. Holland, East Malling Research Station, Maidstone, Kent,
England.

Mr. J. E. Nash, Hydraulics Research Station, Howbery Park, Wallingford,
Berks., England.

Mr. W. A. Pridmore, Messrs. Reckitt and Sons, P. O. Box 78, Hull, 
Yorkshire, England. 

Mr. F. J. Scott, Unit of Biometry, 7, Keble Road, Oxford, England. 

Mr. R. C. Tomlinson, 19, Hillview Avenue, Kenton, Harrow, 
Middlesex, England. 

Mr. Dennis Hugh Ward, National Coal Board, Shade House, Bolton 
Road, Pendlebury, Manchester, England.


Eastern North American Region 


Mr. Edward Newman Brandt, Jr., University of Oklahoma Medical 
Center, 800 NE 18th, Oklahoma City, Oklahoma, U.S.A. 

Mr. E. I. Burdock, Dept. of Mental Hygiene, 722 West 168 Street,
New York 32, N. Y., U.S.A. 

Mr. John O'S. Francis, Diabetes Research Unit, 77 Warren Street, 
Brighton 35, Massachusetts, U.S.A. 

Mr. Dennis K. Johnson, Arthur D. Little, Inc., 15 Acorn Park, Cambridge 40,
Massachusetts, U.S.A.

Mr. Robert Kaplan, 3601 Wisconsin Avenue, NW, Apt. 702, Washing- 
ton 16, D. C., U.S.A. 

Dr. Steven C. King, Mt. Hope Poultry Farm, Inc., Batavia, New York, 
U.S.A. 

Dr. Ken-ichi Kojima, Department of Statistics, North Carolina State 
College, Raleigh, North Carolina, U.S.A. 

Mr. Stuart H. Mann, Box 243, Champaign, Illinois, U.S.A. 

Mr. Felix E. Moore, School of Public Health, University of Michigan, 
Ann Arbor, Michigan, U.S.A. 

Shirley DeBobes Sternberg, 655 E. 14th Street, New York 9, N. Y.,
U.S.A. 

Mr. Donovan J. Thompson, Department of Biostatistics, University of 
Pittsburgh, Pittsburgh 13, Pennsylvania, U.S.A. 




Mr. David L. Wallace, Department of Statistics, University of Chicago,
Chicago 37, Illinois, U.S.A. 

Dr. Irwin B. Wood, Parasitic Chemotherapeutics Section, American 
Cyanamid, Pearl River, New York, U.S.A. 


French Region 


Dr. Victor Jalibert, 16 Bd. de Strasbourg, Paris Xe, France.
Dr. Alice Lotte, 2 rue Lavoisier, Paris (8), France. 


German Region 


Dr. med. Herbert Jordan, Badehaus, Bad Elster, Germany. 
Miss G. Mayer, Bundesanstalt f. Wein- u. Obstbau, Klosterneuburg
bei Wien, Oesterreich, Austria.


India 
Mr. Sri. K. Srinvasan, I.A.S. (Retd.), Chairman, Coffee Board, Bangalore-9,
India.


Japan 

Mr. Susumu Kikuchi, c/o Fac. of Tech., Osaka University, Nishi-Ogi-
machi, Kita-ku, Japan. 

Mr. Shigeichi Moriguchi, Syoan Minami-machi 6, Suginami-ku, Japan. 

Mr. Yasuo Otsuka, National Sericultural Experiment Station, 1852, 
Hino-machi, Minami-Tama-gun, Tokyo, Japan.

Mr. Shoji Ura, Keio University, 794 Koganei-shi, Tokyo, Japan.

Mr. Sumiyasu Yamamoto, Department of Statistics, Nara Medical 
College, Kashihara-shi, Nara-ken, Japan. 

Mr. Seiji Yoshikawa, 1354 Hachimanmae, Kokubunji-machi, Kita-Tama-gun,
Tokyo, Japan.


Netherlands 

Dr. W. Lammers, Inst Volkagez. Utrecht, Netherlands. 

Dr. B. Veen, Steenstraat 130-3, Arnhem, Netherlands. 

Sweden 

Fil. lie Boris Khnrot, Svenska Sockerfabriks Aktiebolaget, Arlov, 
Sweden. 

Switzerland 


Prof. Dr. Eduard Batschelet, Engelgasse 112, Basel (Schweiz), Switzerland.




Dr. Ing. B. H. Messikommer, c/o Ciba Aktiengesellschaft, Klybeckstrasse 141,
Basel, Switzerland.

Dr. Fritz Hans Schwarzenbach, Theodor Kocher, Universitat, Bern, 
Switzerland. 

Prof. Dr. W. Wegmuller, Aegertenstrasse 1, Bern, Switzerland. 


Western North American Region 


Mr. Leonard J. Goldberg, 55 Kathryn Drive, Pleasant Hill, California, 
U.S.A. 

Dr. George B. Mellon, Research Council of Alberta, 87 Avenue and 
114 Street, Edmonton, Alberta, Canada. 

Dr. Francis L. Stanonis, Carter Oil Company, Durango, California, 
U.S.A. 

Mr. Frank Tusko, 2109 Wesbrook Place, Vancouver 8, B. C., Canada. 

Mr. John M. Weiner, 12553 Lorne Street, North Hollywood, California. 

Dr. Alvin D. Wiggins, Hanford Laboratories Operation, Richland, 
Washington. 




NEWS AND ANNOUNCEMENTS 


Members are invited to transmit to their National or Regional Secretary 
(if members at large, to the General Secretary) news of appointments, 
distinctions, or retirements, and announcements of professional interest. 


BIOMETRICS ADDRESS CHANGE 


Effective September 1, 1959, the Business Office and Editor’s
Office of Biometrics will move from the Department of Statistics, The 
Virginia Polytechnic Institute to the Department of Statistics, The 
Florida State University, Tallahassee, Florida, U.S.A. The new address
should be used for all correspondence relating to direct subscriptions
to Biometrics and to back issue orders.

Members of the Biometric Society should not send changes of 
address or any correspondence relating to their subscriptions to Bio- 
metrics to the Business Office of Biometrics. Members should continue 
to address all such correspondence to the Business Office of the Bio- 
metric Society, 509 West Hill Road, Knoxville 19, Tennessee, U.S.A. 

Correspondence regarding manuscripts and other material for 
Biometrics should be directed to the Tallahassee address. While the 
change in offices is being effected, there may be some delays in corre- 
spondence but all matters relating to Biometrics will be handled as 
promptly as possible. 


NEW ADDRESS: BIOMETRICS 
DEPARTMENT OF STATISTICS 
THE FLORIDA STATE UNIVERSITY 
TALLAHASSEE, FLORIDA, U.S.A. 


Irwin D. J. Bross will head the Department of Statistics at Roswell 
Park Memorial Institute, commencing in September 1959. His ad- 
dress will be: Department of Statistics, Roswell Park Memorial Institute, 
666 Elm Street, Buffalo 3, New York. During the past 7 years, Dr. 
Bross has been statistical consultant at Cornell University Medical 
College, the Sloan-Kettering Institute, and the associated hospitals. 
For three years prior to this, he was Research Associate in the Department
of Biostatistics, School of Hygiene and Public Health at Johns
Hopkins University.

Margaret P. Martin, formerly with the Upstate Medical Center of
the State University of New York, is presently a member of the staff of
the School of Hygiene and Public Health, Johns Hopkins University,
715 N. Wolfe Street, Baltimore 5, Maryland.

Saban Karatas, a staff member of Ataturk Universitesi of Turkey, is
at present studying Biometry and Animal Breeding at Cornell
University. Dr. Karatas is an exchange visitor as an International
Cooperation Administration participant through a contract with the
University of Nebraska.

Mordecai H. Gordon, formerly Assistant Director, Veterans
Administration, Central Neuropsychiatric Research Unit at Perry Point,
Maryland, is presently employed at the National Institute of
Neurological Diseases and Blindness, National Institutes of Health, U. S.
Public Health Service. Dr. Gordon is participating in “A Collaborative
Project for the Study of Cerebral Palsy, Mental Retardation, and other
Neurological and Sensory Disorders of Childhood.”

Jerry S. Olsen joined the staff of the Oak Ridge National Laboratory 
as a Geobotanist in the Ecology Group in the fall of 1958. He was 
formerly an Associate Forest Ecologist at the Connecticut Agricultural 
Experiment Station. 

R. E. Patterson, formerly Vice Director of the Texas Agricultural
Experiment Station, is now Vice President for Agriculture, Texas A
and M College System, College Station, Texas.

Joseph Gani, a member of the Department of Mathematics, Uni- 
versity of W. A., Nedlands, Western Australia is spending one year at 
Columbia University and several months at Stanford University in 
California before returning to his post as Reader in Mathematical 
Statistics at the University of Western Australia. 

George F. Sprague, formerly an Agronomist at Iowa State College 
at Ames, Iowa, is presently employed at U. S. Department of Agri- 
culture, Agricultural Research Services at Beltsville, Maryland. 

Irma Coons has left the Westinghouse Electric Corporation for a 
position with General Foods Research Center at Tarrytown, New York. 

Nicholas E. Manos has changed his position from Chief Statistician,
Air Pollution Medical Program to Chief of the Science Program Analysis,
National Aeronautics and Space Administration.

Wilford L. Davis has taken a position with Westinghouse Electric 
Corporation Atomic Power Division in Pittsburgh. 

Edward B. Seligmann, Jr., formerly at Fort Detrick, Maryland, is
presently employed as Chief of the Reference Standards Section,
Laboratory of Control Activities, Division of Biologic Standards, National
Institutes of Health, Bethesda, Maryland.

Allegra H. Rodgers has left E. R. Squibb and has taken a position 
with Bristol Myers Products Division, Hillside, New Jersey. 

Donald W. Bailey has left the Zoology Department of the University
of Kansas to take the position of Chief of the Genetics Unit, Laboratory
Aids Branch, National Institutes of Health, Bethesda, Maryland.

David B. Christian is employed by Bendix Products-Missiles, in 
Mishawaka, Indiana. 

Joe L. Powell is taking a Master’s Degree in the Biostatistics Section 
at Tulane Medical School in New Orleans, Louisiana. 

Louis Munan is with the Regional Office of the W. H. O., Pan
American Health Organization. In 1957 and 1958 Mr. Munan was
Visiting Professor in Medical Statistics under a Fulbright Lectureship
Award at the Universities of San Marcos (Lima, Peru), of Quito (Quito,
Ecuador), and of Guayaquil (Guayaquil, Ecuador).

Geoffrey S. Watson has joined the staff of the Department of Mathematics
at the University of Toronto. Dr. Watson was formerly at the
Australian National University at Canberra. Since leaving Australia
he has spent his time at the Biological Laboratory in Cold Spring 
Harbor, the Mathematics Department of Princeton University, and 
the Department of Statistics at the University of North Carolina. 

James W. Degan has left the Lincoln Laboratory at MIT to assume 
the position of Head, Human Factors Department of the Mitre Corpora- 
tion, Lexington, Massachusetts. 

Allan Birnbaum has been appointed Associate Professor of Mathe- 
matical Statistics in the Department of Mathematics and the Institute 
of Mathematical Sciences at New York University. 

Ralph A. Bradley has joined the faculty of The Florida State Uni- 
versity at Tallahassee, Florida. He will be Chairman of the Depart- 
ment of Statistics at the University. During the past nine years, Dr. 
Bradley has been Professor of Statistics and consultant to the Agri- 
cultural Experiment Station at the Virginia Polytechnic Institute. 

Jacob E. Lieberman, division of research services of the biometrics 
branch, National Institutes of Health, Bethesda, Maryland, began a 
month’s work in June on biological statistics with Dr. Kelsey Milner at 
the Rocky Mountain National Laboratory in Hamilton, Montana. 


FLORIDA STATE UNIVERSITY DEPARTMENT OF 
STATISTICS 


The Florida State University at Tallahassee, Florida has established
a Department of Statistics effective July 1959. The initial
faculty will consist of Ralph A. Bradley, Chairman, John L. Bagg and
Lonnie L. Lasman. Programs of study leading to the Bachelor of 
Science and Master of Science degrees in statistics will be initiated in 
the Fall 1959 semester and advanced graduate work will be developed 
in the near future. The Department of Statistics will provide uni- 
versity-wide training and consulting. Inquiries regarding the program 
will be welcome and some assistantships will be available for graduate 
students. 


KANSAS STATE UNIVERSITY 


Effective July 1 a Department of Statistics was created at Kansas 
State University in Manhattan, Kansas. This department and the 
Statistical Laboratory, which has existed for thirteen years, are under 
one head, Dr. H. C. Fryer. Other members of the department are Drs. 
Arlin M. Feyerherm and Stanley Wearden, and instructors Gary F. 
Krause and Robert S. Cochran. For a number of years the Master’s
degree in statistics has been offered through the Department of Mathe- 
matics. The new Department of Statistics hopes to be in a position 
to offer the Ph.D. degree in statistics within a few years. 


EDUCATIONAL TESTING SERVICE FELLOWSHIPS 


Dr. Harold Gulliksen announces that The Educational Testing 
Service, Princeton, N. J., is offering for 1960-61 its thirteenth series of 
research fellowships in psychometrics leading to the Ph.D. degree at 
Princeton University. Open to men who are acceptable to the Graduate 
School of the University, the two fellowships each carry a stipend of 
$2,650 a year and are normally renewable. Fellows will be engaged in
part-time research in the general area of psychological measurement at 
the offices of the Educational Testing Service and will, in addition, 
carry a normal program of studies in the Graduate School. 

Suitable undergraduate preparation may consist either of a major 
in psychology with supporting work in mathematics, or a major in 
mathematics together with some work in psychology. However, in 
choosing fellows, primary emphasis is given to superior scholastic at- 
tainment and research interests rather than to specific course prepara- 
tion. 

The closing date for completing applications is January 1, 1960. In- 
formation and application blanks will be available about September 15 
and may be obtained from: Director of Psychometric Fellowship Program,
Educational Testing Service, 20 Nassau Street, Princeton, New
Jersey. 




TECHNICAL TRANSLATIONS 

The Department of Commerce, through its Office of Technical 
Services (OTS), has started a new journal Technical Translations. 

This journal appears twice a month and contains abstracts of 
foreign scientific publications (including mathematics) for which trans- 
lations (in many cases from the Russian) are available through OTS. 

Further details may be obtained from the Office of Technical Services,
U. S. Department of Commerce, Washington 25, D. C.


NEW YORK STATE EMPLOYMENT 


DIRECTOR OF HEALTH STATISTICS—New York State De- 
partment of Health, Albany, N. Y. $12,346 to $14,476 in 5 annual 
salary increases. Examination open to qualified U.S. citizens. Candi- 
dates must possess a doctor’s degree in science, public health, or medi- 
cine, including or supplemented by one academic year in a school of 
public health with specialization in biostatistics. In addition candidates 
must show extensive experience in biostatistics, involving project 
planning. Two years of the experience must have been in a position 
which combined important administrative responsibilities with pro- 
fessional responsibilities in biostatistics. It is anticipated that the oral 
test and/or selection interview will be held at the APHA meeting in 
Atlantic City between October 19 and October 23, 1959. For detailed
announcements and application blanks write to Recruitment Unit, Box 
91, New York State Civil Service Department, The State Campus, 
Albany 1, New York. 


Advertisement 




JOURNAL OF 
AMERICAN STATISTICAL 


ASSOCIATION 
Volume 54 Number 286 June 1959 
ARTICLES 
A Guide to the Literature on Statistics of Religious Affiliation with
References to Related Social Studies
        BENSON Y. LANDIS

Increase in Rent of Dwelling Units From 1940 to 1950
        MARGARET G. REID

The Demand for Fertilizer in 1954: Inter-State Study
        ZVI GRILICHES

Parameter Estimates and Autonomous Growth
        W. A. NEISWANGER AND T. A. YANCEY

On Variances of Ratios and Their Differences in Multi-Stage Samples
        LESLIE KISH AND IRENE HESS

Accuracy Requirements for Acceptance Testing of Complex Systems
        C. R. GATES AND J. P. FEAREY

Correlation Between Sample Means and Ranges
        BERNARD OSTLE AND GEORGE P. STECK

Problems in Mental Test Theory Arising from Errors of Measurement
        FREDERIC M. LORD

On the Problem of Matching Lists by Samples
        W. EDWARDS DEMING AND GERALD J. GLASSER


BOOK REVIEWS 


Abstracts of Papers Presented at the 118th Annual Meeting


AMERICAN STATISTICAL ASSOCIATION 
1757 K Street, N.W., Washington 6, D.C. 


For further information, please contact the American Statistical Association, 
1757 K Street, N.W., Washington 6, D.C.




TECHNOMETRICS 


A Journal of Statistics for the 
Physical, Chemical, and Engineering Sciences 


Vol. 1, No. 2 May, 1959 


CONTENTS 


Measurements Made by Matching with Known Standards
        W. J. YOUDEN, W. S. CONNOR AND N. C. SEVERO

Random Balance Experimentation
        F. E. SATTERTHWAITE

The Application of Random Balance Designs
        THOMAS A. BUDNE

Discussion of the Papers of Messrs. Satterthwaite and Budne
        W. J. YOUDEN, O. KEMPTHORNE, J. W. TUKEY, G. E. P. BOX, AND J. S. HUNTER

Quick Analysis Methods for Random Balance Screening Experiments
        F. J. ANSCOMBE


Vol. 1, No. 3 August, 1959 


CONTENTS 


Simplified Estimators for the Normal Distribution when Samples are Singly
Censored or Truncated
        A. CLIFFORD COHEN, JR.

Control Chart Tests Based on Geometric Moving Averages
        S. W. ROBERTS

The Measuring Process
        JOHN MANDEL

Factorial Experiments in Life Testing 


The Use of LaGrange Multipliers with Response Surfaces 
A. W. UMLAND W. N., 


A Statistical Model for Evaluating the Reliability of Safety Systems for 
Plants Manufacturing Hazardous Products Louis B. KAHN 


Technometrics is published quarterly in February, May, August, and 
November. The annual non-member subscription rate is $8.00. To members of 
the American Statistical Association and the American Society for Quality 
Control the rate is $6.00. Inquiries should be addressed to either Technometrics,
American Statistical Association, 404 Beacon Bldg., 1757 K Street N. W., 
Washington 6, D. C. or Technometrics, American Society for Quality Control, 
Rm. 6197, Plankinton Bldg., 161 Wisconsin Ave., Milwaukee 3, Wisconsin. 




INFORMATION FOR CONTRIBUTORS 


MANUSCRIPTS 


Contributions for Biometrics may be addressed to Dr. Ralph A. Bradley, Department
of Statistics, The Florida State University, Tallahassee, Florida, U.S.A.;
authors residing in the following Society Regions can expedite consideration of papers
by submitting them to the appropriate Associate Editor, namely: BRITISH REGION:
Dr. S. C. Pearce, East Malling Research Station, East Malling, Maidstone,
Kent, England; AUSTRALASIAN REGION: Dr. E. A. Cornish, University of 
Adelaide, Adelaide, Australia; FRENCH REGION: Dr. Georges Teissier, Faculté 
des Sciences de Paris, 1 rue V. Cousin, Paris, France. QUERIES, NOTES, and 
related correspondence should be directed to Dr. D. J. Finney, Department of 
Statistics, University of Aberdeen, Meston Walk, Old Aberdeen, Scotland. 

MANUSCRIPTS must be submitted in triplicate, with typescript doublespaced 
throughout. Marginal notes may obviate typographical difficulties presented by 
complicated formulae or tables—authors should not attempt editorial instructions 
or markings for the printer. TABLES should be identified by arabic number and 
by a short descriptive title. ILLUSTRATIONS should also be identified by arabic 
number and by a brief caption. (Captions should not be included in illustrations, 
but should be typewritten collectively on an accompanying sheet.) Originals 
should be approximately 8.5 x 11 in. (21.5 x 28 cm.). The original of each chart, 
diagram, or graph should be executed in black on white drawing paper or board, on 
blue tracing linen, or on coordinate paper ruled in blue only; coordinate lines to be 
reproduced should be ruled in black. For printing, illustrations may be reduced to 
¥ or \ original dimensions. Lines should therefore be of sufficient thickness, and 
decimal points, periods, and stippled dots should be solid black circles large enough 
to reproduce well. Lettering and numerals should be at least 1 mm. high when 
reproduced in a cut 3 in. (7.5 cm.) wide. Photographs should be prints on glossy 
paper with strong contrasts, and if grouped in a plate should be mounted contig- 
uously. All tables and illustrations should be mentioned explicitly in the text. 
REFERENCES (BIBLIOGRAPHIC) should be collectively listed alphabetically 
by author; textual citation by author and year is preferred. 


ABSTRACTS 


Abstracts of papers presented at meetings of the Biometric Society or of its 
regions are printed in Biometrics following such meetings. They should be submitted 
to the person designated to receive them for a particular meeting in exactly the form 
published in Biometrics (except for an Abstract Number), doublespaced on bond 
paper, and in duplicate. Use of formulae requiring display printing is to be avoided. 


ANNOUNCEMENTS, AND Biometric Society Reports 


International and regional reports and notices should be submitted by the 
appropriate officers of the Society and its Regions in duplicate doublespaced on 
separate sheets exactly as they are to be printed in Biometrics. Other material to
be printed in News and Announcements should also be submitted doublespaced
and in duplicate.


SUSTAINING MEMBERS OF THE BIOMETRIC SOCIETY


Abbott Laboratories 

American Cancer Society, Inc.

Heisdorf and Nelson Farms, Inc. 

Merck, Sharp and Dohme Research Laboratories 
Schering Corporation 

Smith, Kline and French Laboratories 

E. R. Squibb and Sons 

Wallace Laboratories, Division of Carter Products 
Wyeth Institute of Applied Biochemistry 




BACK ISSUES 


Back issues of Biometrics are available at the following postage-paid
prices in U.S.A. currency: 


                                Price per         Price per
Year    Volume    Number      Single Number    Volume (unbound)
1945       1      1 to 6          $1.00             $6.00
1946       2      1 to 6           1.00              6.00
1947       3      1 to 4           1.50              5.00
1948       4      1 to 4           1.50              5.00
1949       5      1 to 4           1.50              5.00
1950       6      1 to 4           1.50              5.00
1951       7      1 to 4           2.00              8.00
1952       8      1 to 4           2.00              8.00
1953       9      1 to 4           2.00              8.00
1954      10      1 to 4           2.00              8.00
1955      11      1 to 4           2.00              8.00
1956      12      1 to 4           2.00              8.00
1957      13      1 to 4           2.00              8.00
1958      14      1 to 4           2.00              8.00


Reprints of individual articles are not available except to authors at the 
time of printing. Three special issues are among the numbers listed 
above. They are: 


1947 Volume 3 Number 1 The Analysis of Variance 
1951 Volume 7 Number 1 Components of Variance 
1957 Volume 13 Number 3 The Analysis of Covariance 


Also available are: 
Fishery Reprint Series (Selected reprints from Vol. 5) $1.00 
Subject Index (Volumes 1-10) 1.00 
Proceedings, International Biometric Symposium, 
Campinas, Brazil, 1955. 1.00 


Inquiries, non-member subscriptions, and orders for back issues and 
other material listed above should be addressed to: BIOMETRICS,
DEPARTMENT OF STATISTICS, THE FLORIDA STATE UNIVERSITY, TALLAHASSEE,
FLORIDA, U.S.A.



